Lecture 20 exercise - section 1

Lecture 20 exercise - section 1#

Live-coded solution.

Electricity data#

import pandas as pd

electricity = pd.read_csv("data/P_Data_Extract_From_World_Development_Indicators/0a6eb0d7-9c99-4d18-bf17-7839ff3ad1ed_Data.csv")
electricity
Series Name Series Code Country Name Country Code 1990 [YR1990] 2000 [YR2000] 2014 [YR2014] 2015 [YR2015] 2016 [YR2016] 2017 [YR2017] 2018 [YR2018] 2019 [YR2019] 2020 [YR2020] 2021 [YR2021] 2022 [YR2022] 2023 [YR2023]
0 Access to electricity (% of population) EG.ELC.ACCS.ZS Afghanistan AFG .. 4.4 89.5 71.5 97.7 97.7 93.4 97.7 97.7 97.7 85.3 ..
1 Access to electricity (% of population) EG.ELC.ACCS.ZS Albania ALB 100 99.4 100 100 99.9 99.9 100 100 100 100 100 ..
2 Access to electricity (% of population) EG.ELC.ACCS.ZS Algeria DZA .. 98.6 99.3 99.4 99.4 99.5 99.6 99.5 99.7 99.8 100 ..
3 Access to electricity (% of population) EG.ELC.ACCS.ZS American Samoa ASM .. .. .. .. .. .. .. .. .. .. .. ..
4 Access to electricity (% of population) EG.ELC.ACCS.ZS Andorra AND 100 100 100 100 100 100 100 100 100 100 100 ..
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
266 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
267 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
268 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
269 Data from database: World Development Indicators NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
270 Last Updated: 10/24/2024 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

271 rows × 16 columns

Clean#

electricity.fillna(0)
Series Name Series Code Country Name Country Code 1990 [YR1990] 2000 [YR2000] 2014 [YR2014] 2015 [YR2015] 2016 [YR2016] 2017 [YR2017] 2018 [YR2018] 2019 [YR2019] 2020 [YR2020] 2021 [YR2021] 2022 [YR2022] 2023 [YR2023]
0 Access to electricity (% of population) EG.ELC.ACCS.ZS Afghanistan AFG .. 4.4 89.5 71.5 97.7 97.7 93.4 97.7 97.7 97.7 85.3 ..
1 Access to electricity (% of population) EG.ELC.ACCS.ZS Albania ALB 100 99.4 100 100 99.9 99.9 100 100 100 100 100 ..
2 Access to electricity (% of population) EG.ELC.ACCS.ZS Algeria DZA .. 98.6 99.3 99.4 99.4 99.5 99.6 99.5 99.7 99.8 100 ..
3 Access to electricity (% of population) EG.ELC.ACCS.ZS American Samoa ASM .. .. .. .. .. .. .. .. .. .. .. ..
4 Access to electricity (% of population) EG.ELC.ACCS.ZS Andorra AND 100 100 100 100 100 100 100 100 100 100 100 ..
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
266 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
267 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
268 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
269 Data from database: World Development Indicators 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
270 Last Updated: 10/24/2024 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

271 rows × 16 columns

electricity[electricity["Country Code"] == "ASM"]
Series Name Series Code Country Name Country Code 1990 [YR1990] 2000 [YR2000] 2014 [YR2014] 2015 [YR2015] 2016 [YR2016] 2017 [YR2017] 2018 [YR2018] 2019 [YR2019] 2020 [YR2020] 2021 [YR2021] 2022 [YR2022] 2023 [YR2023]
3 Access to electricity (% of population) EG.ELC.ACCS.ZS American Samoa ASM .. .. .. .. .. .. .. .. .. .. .. ..
electricity["1990 [YR1990]"].value_counts(dropna=False)
1990 [YR1990]
..                  159
100                  78
NaN                   5
99.8443069458008      2
92.1548004150391      1
27.3                  1
99.9943911362469      1
99.546800179424       1
99.8425200685198      1
13.9                  1
88.7                  1
32.8                  1
91.3068161010742      1
99.9834289550781      1
70.19                 1
96.6253509521484      1
99.8532638549805      1
98.9990844726563      1
99.9413986206055      1
99.0415649414063      1
95.7630462646484      1
99.8828430175781      1
96.8                  1
70.334586             1
89.9                  1
92.257427             1
99.846923828125       1
94.606559753418       1
87.475116             1
99.9842529296875      1
99.8492813110352      1
Name: count, dtype: int64
electricity["1990 [YR1990]"].mean()
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[5], line 1
----> 1 electricity["1990 [YR1990]"].mean()

File /opt/homebrew/Caskroom/mambaforge/base/envs/computing-in-context/lib/python3.12/site-packages/pandas/core/series.py:6549, in Series.mean(self, axis, skipna, numeric_only, **kwargs)
   6541 @doc(make_doc("mean", ndim=1))
   6542 def mean(
   6543     self,
   (...)
   6547     **kwargs,
   6548 ):
-> 6549     return NDFrame.mean(self, axis, skipna, numeric_only, **kwargs)

File /opt/homebrew/Caskroom/mambaforge/base/envs/computing-in-context/lib/python3.12/site-packages/pandas/core/generic.py:12420, in NDFrame.mean(self, axis, skipna, numeric_only, **kwargs)
  12413 def mean(
  12414     self,
  12415     axis: Axis | None = 0,
   (...)
  12418     **kwargs,
  12419 ) -> Series | float:
> 12420     return self._stat_function(
  12421         "mean", nanops.nanmean, axis, skipna, numeric_only, **kwargs
  12422     )

File /opt/homebrew/Caskroom/mambaforge/base/envs/computing-in-context/lib/python3.12/site-packages/pandas/core/generic.py:12377, in NDFrame._stat_function(self, name, func, axis, skipna, numeric_only, **kwargs)
  12373 nv.validate_func(name, (), kwargs)
  12375 validate_bool_kwarg(skipna, "skipna", none_allowed=False)
> 12377 return self._reduce(
  12378     func, name=name, axis=axis, skipna=skipna, numeric_only=numeric_only
  12379 )

File /opt/homebrew/Caskroom/mambaforge/base/envs/computing-in-context/lib/python3.12/site-packages/pandas/core/series.py:6457, in Series._reduce(self, op, name, axis, skipna, numeric_only, filter_type, **kwds)
   6452     # GH#47500 - change to TypeError to match other methods
   6453     raise TypeError(
   6454         f"Series.{name} does not allow {kwd_name}={numeric_only} "
   6455         "with non-numeric dtypes."
   6456     )
-> 6457 return op(delegate, skipna=skipna, **kwds)

File /opt/homebrew/Caskroom/mambaforge/base/envs/computing-in-context/lib/python3.12/site-packages/pandas/core/nanops.py:147, in bottleneck_switch.__call__.<locals>.f(values, axis, skipna, **kwds)
    145         result = alt(values, axis=axis, skipna=skipna, **kwds)
    146 else:
--> 147     result = alt(values, axis=axis, skipna=skipna, **kwds)
    149 return result

File /opt/homebrew/Caskroom/mambaforge/base/envs/computing-in-context/lib/python3.12/site-packages/pandas/core/nanops.py:404, in _datetimelike_compat.<locals>.new_func(values, axis, skipna, mask, **kwargs)
    401 if datetimelike and mask is None:
    402     mask = isna(values)
--> 404 result = func(values, axis=axis, skipna=skipna, mask=mask, **kwargs)
    406 if datetimelike:
    407     result = _wrap_results(result, orig_values.dtype, fill_value=iNaT)

File /opt/homebrew/Caskroom/mambaforge/base/envs/computing-in-context/lib/python3.12/site-packages/pandas/core/nanops.py:719, in nanmean(values, axis, skipna, mask)
    716     dtype_count = dtype
    718 count = _get_counts(values.shape, mask, axis, dtype=dtype_count)
--> 719 the_sum = values.sum(axis, dtype=dtype_sum)
    720 the_sum = _ensure_numeric(the_sum)
    722 if axis is not None and getattr(the_sum, "ndim", False):

File /opt/homebrew/Caskroom/mambaforge/base/envs/computing-in-context/lib/python3.12/site-packages/numpy/_core/_methods.py:53, in _sum(a, axis, dtype, out, keepdims, initial, where)
     51 def _sum(a, axis=None, dtype=None, out=None, keepdims=False,
     52          initial=_NoValue, where=True):
---> 53     return umr_sum(a, axis, dtype, out, keepdims, initial, where)

TypeError: can only concatenate str (not "int") to str
electricity["1990 [YR1990]"].unique()
array(['..', '100', '95.7630462646484', '92.1548004150391',
       '99.9413986206055', '99.9842529296875', '87.475116',
       '94.606559753418', '99.846923828125', '92.257427', '89.9',
       '99.8443069458008', '70.334586', '96.8', '99.8828430175781',
       '99.8492813110352', '99.0415649414063', '98.9990844726563', '27.3',
       '99.8532638549805', '96.6253509521484', '70.19',
       '99.9834289550781', '91.3068161010742', '32.8', '88.7', '13.9',
       '99.8425200685198', '99.546800179424', '99.9943911362469', nan],
      dtype=object)
electricity["1990 [YR1990]"].astype(int)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[7], line 1
----> 1 electricity["1990 [YR1990]"].astype(int)

File /opt/homebrew/Caskroom/mambaforge/base/envs/computing-in-context/lib/python3.12/site-packages/pandas/core/generic.py:6643, in NDFrame.astype(self, dtype, copy, errors)
   6637     results = [
   6638         ser.astype(dtype, copy=copy, errors=errors) for _, ser in self.items()
   6639     ]
   6641 else:
   6642     # else, only a single dtype is given
-> 6643     new_data = self._mgr.astype(dtype=dtype, copy=copy, errors=errors)
   6644     res = self._constructor_from_mgr(new_data, axes=new_data.axes)
   6645     return res.__finalize__(self, method="astype")

File /opt/homebrew/Caskroom/mambaforge/base/envs/computing-in-context/lib/python3.12/site-packages/pandas/core/internals/managers.py:430, in BaseBlockManager.astype(self, dtype, copy, errors)
    427 elif using_copy_on_write():
    428     copy = False
--> 430 return self.apply(
    431     "astype",
    432     dtype=dtype,
    433     copy=copy,
    434     errors=errors,
    435     using_cow=using_copy_on_write(),
    436 )

File /opt/homebrew/Caskroom/mambaforge/base/envs/computing-in-context/lib/python3.12/site-packages/pandas/core/internals/managers.py:363, in BaseBlockManager.apply(self, f, align_keys, **kwargs)
    361         applied = b.apply(f, **kwargs)
    362     else:
--> 363         applied = getattr(b, f)(**kwargs)
    364     result_blocks = extend_blocks(applied, result_blocks)
    366 out = type(self).from_blocks(result_blocks, self.axes)

File /opt/homebrew/Caskroom/mambaforge/base/envs/computing-in-context/lib/python3.12/site-packages/pandas/core/internals/blocks.py:758, in Block.astype(self, dtype, copy, errors, using_cow, squeeze)
    755         raise ValueError("Can not squeeze with more than one column.")
    756     values = values[0, :]  # type: ignore[call-overload]
--> 758 new_values = astype_array_safe(values, dtype, copy=copy, errors=errors)
    760 new_values = maybe_coerce_values(new_values)
    762 refs = None

File /opt/homebrew/Caskroom/mambaforge/base/envs/computing-in-context/lib/python3.12/site-packages/pandas/core/dtypes/astype.py:237, in astype_array_safe(values, dtype, copy, errors)
    234     dtype = dtype.numpy_dtype
    236 try:
--> 237     new_values = astype_array(values, dtype, copy=copy)
    238 except (ValueError, TypeError):
    239     # e.g. _astype_nansafe can fail on object-dtype of strings
    240     #  trying to convert to float
    241     if errors == "ignore":

File /opt/homebrew/Caskroom/mambaforge/base/envs/computing-in-context/lib/python3.12/site-packages/pandas/core/dtypes/astype.py:182, in astype_array(values, dtype, copy)
    179     values = values.astype(dtype, copy=copy)
    181 else:
--> 182     values = _astype_nansafe(values, dtype, copy=copy)
    184 # in pandas we don't store numpy str dtypes, so convert to object
    185 if isinstance(dtype, np.dtype) and issubclass(values.dtype.type, str):

File /opt/homebrew/Caskroom/mambaforge/base/envs/computing-in-context/lib/python3.12/site-packages/pandas/core/dtypes/astype.py:133, in _astype_nansafe(arr, dtype, copy, skipna)
    129     raise ValueError(msg)
    131 if copy or arr.dtype == object or dtype == object:
    132     # Explicit copy, or required since NumPy can't view from / to object.
--> 133     return arr.astype(dtype, copy=True)
    135 return arr.astype(dtype, copy=copy)

ValueError: invalid literal for int() with base 10: '..'
import numpy as np

electricity["1990 [YR1990]"].replace({"..": np.nan}).value_counts()
1990 [YR1990]
100                 78
99.8443069458008     2
99.0415649414063     1
99.546800179424      1
99.8425200685198     1
13.9                 1
88.7                 1
32.8                 1
91.3068161010742     1
99.9834289550781     1
70.19                1
96.6253509521484     1
99.8532638549805     1
27.3                 1
98.9990844726563     1
99.8492813110352     1
95.7630462646484     1
99.8828430175781     1
96.8                 1
70.334586            1
89.9                 1
92.257427            1
99.846923828125      1
94.606559753418      1
87.475116            1
99.9842529296875     1
99.9413986206055     1
92.1548004150391     1
99.9943911362469     1
Name: count, dtype: int64
electricity["1990 [YR1990]"].replace({"..": np.nan}).astype(float)
0        NaN
1      100.0
2        NaN
3        NaN
4      100.0
       ...  
266      NaN
267      NaN
268      NaN
269      NaN
270      NaN
Name: 1990 [YR1990], Length: 271, dtype: float64
electricity_clean = electricity.replace({"..": np.nan})
electricity_clean
/var/folders/kr/nx0m1j811kz5vy8c87ffchzr0000gn/T/ipykernel_70664/1911216467.py:1: FutureWarning: Downcasting behavior in `replace` is deprecated and will be removed in a future version. To retain the old behavior, explicitly call `result.infer_objects(copy=False)`. To opt-in to the future behavior, set `pd.set_option('future.no_silent_downcasting', True)`
  electricity_clean = electricity.replace({
Series Name Series Code Country Name Country Code 1990 [YR1990] 2000 [YR2000] 2014 [YR2014] 2015 [YR2015] 2016 [YR2016] 2017 [YR2017] 2018 [YR2018] 2019 [YR2019] 2020 [YR2020] 2021 [YR2021] 2022 [YR2022] 2023 [YR2023]
0 Access to electricity (% of population) EG.ELC.ACCS.ZS Afghanistan AFG NaN 4.4 89.5 71.5 97.7 97.7 93.4 97.7 97.7 97.7 85.3 NaN
1 Access to electricity (% of population) EG.ELC.ACCS.ZS Albania ALB 100 99.4 100 100 99.9 99.9 100 100 100 100 100 NaN
2 Access to electricity (% of population) EG.ELC.ACCS.ZS Algeria DZA NaN 98.6 99.3 99.4 99.4 99.5 99.6 99.5 99.7 99.8 100 NaN
3 Access to electricity (% of population) EG.ELC.ACCS.ZS American Samoa ASM NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
4 Access to electricity (% of population) EG.ELC.ACCS.ZS Andorra AND 100 100 100 100 100 100 100 100 100 100 100 NaN
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
266 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
267 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
268 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
269 Data from database: World Development Indicators NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN
270 Last Updated: 10/24/2024 NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN NaN

271 rows × 16 columns

Reshape#

electricity_per_year = electricity_clean.melt(
    id_vars=["Series Name", "Series Code", "Country Name", "Country Code"],
    var_name="Year",
    value_name="Access to electricity (% pop)",
)
electricity_per_year
Series Name Series Code Country Name Country Code Year Access to electricity (% pop)
0 Access to electricity (% of population) EG.ELC.ACCS.ZS Afghanistan AFG 1990 [YR1990] NaN
1 Access to electricity (% of population) EG.ELC.ACCS.ZS Albania ALB 1990 [YR1990] 100
2 Access to electricity (% of population) EG.ELC.ACCS.ZS Algeria DZA 1990 [YR1990] NaN
3 Access to electricity (% of population) EG.ELC.ACCS.ZS American Samoa ASM 1990 [YR1990] NaN
4 Access to electricity (% of population) EG.ELC.ACCS.ZS Andorra AND 1990 [YR1990] 100
... ... ... ... ... ... ...
3247 NaN NaN NaN NaN 2023 [YR2023] NaN
3248 NaN NaN NaN NaN 2023 [YR2023] NaN
3249 NaN NaN NaN NaN 2023 [YR2023] NaN
3250 Data from database: World Development Indicators NaN NaN NaN 2023 [YR2023] NaN
3251 Last Updated: 10/24/2024 NaN NaN NaN 2023 [YR2023] NaN

3252 rows × 6 columns

Plot#

electricity_per_year["Access to electricity (% pop)"] = electricity_per_year["Access to electricity (% pop)"].astype(float)
import plotly.express as px

px.line(
    electricity_per_year,
    x="Year",
    y="Access to electricity (% pop)",
    color="Country Name",
    title="Access to electricity for each country by year",
)

Happiness data#

happiness = pd.read_excel("data/DataForTable2.1.xls")
happiness
Country name year Life Ladder Log GDP per capita Social support Healthy life expectancy at birth Freedom to make life choices Generosity Perceptions of corruption Positive affect Negative affect
0 Afghanistan 2008 3.723590 7.350416 0.450662 50.500000 0.718114 0.164055 0.881686 0.414297 0.258195
1 Afghanistan 2009 4.401778 7.508646 0.552308 50.799999 0.678896 0.187297 0.850035 0.481421 0.237092
2 Afghanistan 2010 4.758381 7.613900 0.539075 51.099998 0.600127 0.117861 0.706766 0.516907 0.275324
3 Afghanistan 2011 3.831719 7.581259 0.521104 51.400002 0.495901 0.160098 0.731109 0.479835 0.267175
4 Afghanistan 2012 3.782938 7.660506 0.520637 51.700001 0.530935 0.234157 0.775620 0.613513 0.267919
... ... ... ... ... ... ... ... ... ... ... ...
2358 Zimbabwe 2019 2.693523 7.697755 0.759162 53.099998 0.631908 -0.050874 0.830652 0.658434 0.235354
2359 Zimbabwe 2020 3.159802 7.596050 0.717243 53.575001 0.643303 0.002848 0.788523 0.660658 0.345736
2360 Zimbabwe 2021 3.154578 7.656878 0.685151 54.049999 0.667636 -0.079007 0.756945 0.609917 0.241682
2361 Zimbabwe 2022 3.296220 7.670073 0.666172 54.525002 0.651987 -0.072935 0.752632 0.640609 0.191350
2362 Zimbabwe 2023 3.572386 7.678590 0.693817 55.000000 0.734613 -0.068837 0.757494 0.609752 0.178953

2363 rows × 11 columns

Merge#

pd.merge(
    electricity_per_year,
    happiness,
    left_on=["Country Code", "Year"],
    right_on=["Country name", "year"],
)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In[15], line 1
----> 1 pd.merge(
      2     electricity_per_year,
      3     happiness,
      4     left_on=["Country Code", "Year"],
      5     right_on=["Country name", "year"]
      6 )

File /opt/homebrew/Caskroom/mambaforge/base/envs/computing-in-context/lib/python3.12/site-packages/pandas/core/reshape/merge.py:170, in merge(left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, copy, indicator, validate)
    155     return _cross_merge(
    156         left_df,
    157         right_df,
   (...)
    167         copy=copy,
    168     )
    169 else:
--> 170     op = _MergeOperation(
    171         left_df,
    172         right_df,
    173         how=how,
    174         on=on,
    175         left_on=left_on,
    176         right_on=right_on,
    177         left_index=left_index,
    178         right_index=right_index,
    179         sort=sort,
    180         suffixes=suffixes,
    181         indicator=indicator,
    182         validate=validate,
    183     )
    184     return op.get_result(copy=copy)

File /opt/homebrew/Caskroom/mambaforge/base/envs/computing-in-context/lib/python3.12/site-packages/pandas/core/reshape/merge.py:807, in _MergeOperation.__init__(self, left, right, how, on, left_on, right_on, left_index, right_index, sort, suffixes, indicator, validate)
    803 self._validate_tolerance(self.left_join_keys)
    805 # validate the merge keys dtypes. We may need to coerce
    806 # to avoid incompatible dtypes
--> 807 self._maybe_coerce_merge_keys()
    809 # If argument passed to validate,
    810 # check if columns specified as unique
    811 # are in fact unique.
    812 if validate is not None:

File /opt/homebrew/Caskroom/mambaforge/base/envs/computing-in-context/lib/python3.12/site-packages/pandas/core/reshape/merge.py:1508, in _MergeOperation._maybe_coerce_merge_keys(self)
   1502     # unless we are merging non-string-like with string-like
   1503     elif (
   1504         inferred_left in string_types and inferred_right not in string_types
   1505     ) or (
   1506         inferred_right in string_types and inferred_left not in string_types
   1507     ):
-> 1508         raise ValueError(msg)
   1510 # datetimelikes must match exactly
   1511 elif needs_i8_conversion(lk.dtype) and not needs_i8_conversion(rk.dtype):

ValueError: You are trying to merge on object and int64 columns for key 'Year'. If you wish to proceed you should use pd.concat

Clean#

electricity_per_year["Year"].unique()
array(['1990 [YR1990]', '2000 [YR2000]', '2014 [YR2014]', '2015 [YR2015]',
       '2016 [YR2016]', '2017 [YR2017]', '2018 [YR2018]', '2019 [YR2019]',
       '2020 [YR2020]', '2021 [YR2021]', '2022 [YR2022]', '2023 [YR2023]'],
      dtype=object)
electricity_per_year["Year"] = electricity_per_year["Year"].str.replace(r" \[YR\d{4}\]", "", regex=True).astype(int)
electricity_per_year
Series Name Series Code Country Name Country Code Year Access to electricity (% pop)
0 Access to electricity (% of population) EG.ELC.ACCS.ZS Afghanistan AFG 1990 NaN
1 Access to electricity (% of population) EG.ELC.ACCS.ZS Albania ALB 1990 100.0
2 Access to electricity (% of population) EG.ELC.ACCS.ZS Algeria DZA 1990 NaN
3 Access to electricity (% of population) EG.ELC.ACCS.ZS American Samoa ASM 1990 NaN
4 Access to electricity (% of population) EG.ELC.ACCS.ZS Andorra AND 1990 100.0
... ... ... ... ... ... ...
3247 NaN NaN NaN NaN 2023 NaN
3248 NaN NaN NaN NaN 2023 NaN
3249 NaN NaN NaN NaN 2023 NaN
3250 Data from database: World Development Indicators NaN NaN NaN 2023 NaN
3251 Last Updated: 10/24/2024 NaN NaN NaN 2023 NaN

3252 rows × 6 columns

pd.merge(
    electricity_per_year,
    happiness,
    left_on=["Country Code", "Year"],
    right_on=["Country name", "year"],
)
Series Name Series Code Country Name Country Code Year Access to electricity (% pop) Country name year Life Ladder Log GDP per capita Social support Healthy life expectancy at birth Freedom to make life choices Generosity Perceptions of corruption Positive affect Negative affect

Jump over to the end of official solution.