Estudiante: Pierina Milla
Importo la librería Pandas y leo los 3 archivos con la data que han sido subidos al repositorio Github.
import pandas as pd
co2Link='https://github.com/IntroSpatialDataScience/ClimateChangeData/raw/main/data/CO2_1990.csv'
precipitationLink = 'https://github.com/IntroSpatialDataScience/ClimateChangeData/raw/main/data/precipitation_mean.csv'
solarElectricityLink = 'https://github.com/IntroSpatialDataScience/ClimateChangeData/raw/main/data/solar_energy.csv'
co2=pd.read_csv(co2Link)
precipitation=pd.read_csv(precipitationLink)
solar = pd.read_csv(solarElectricityLink)
Realizo una copia de cada dataframe para trabajar sin modificar el archivo original
co2_new=co2.copy()
precipitation_new=precipitation.copy()
solar_new=solar.copy()
Cambio los nombres de las columnas referentes al territorio o país de cada DF para facilitar el merge entre ellos.
co2_new.rename(columns={'Country or Area':'Country'}, inplace=True)
precipitation_new.rename(columns={'Country or Territory':'Country'}, inplace=True)
solar_new.rename(columns={'Country or Area':'Country'}, inplace=True)
Toda la data en el mismo formato 'title':
precipitation_new.Country = precipitation_new.Country.str.title()
co2_new.Country = co2_new.Country.str.title()
solar_new.Country = solar_new.Country.str.title()
El número de filas y columnas de cada dataframe es:
co2_new.shape,precipitation_new.shape,solar_new.shape
((43, 3), (4263, 33), (205, 5))
co2_new.columns,precipitation_new.columns,solar_new.columns
(Index(['Country', 'Year', 'Value'], dtype='object'),
Index(['Country', 'Station Name', 'Period', 'Statistic Description', 'Unit',
'Jan', 'Jan Footnotes', 'Feb', 'Feb Footnotes', 'Mar', 'Mar Footnotes',
'Apr', 'Apr Footnotes', 'May', 'May Footnotes', 'Jun', 'Jun Footnotes',
'Jul', 'Jul Footnotes', 'Aug', 'Aug Footnotes', 'Sep', 'Sep Footnotes',
'Oct', 'Oct Footnotes', 'Nov', 'Nov Footnotes', 'Dec', 'Dec Footnotes',
'Annual', 'Annual Footnotes', 'Annual NCDC Computed Value',
'Annual NCDC Computed Value Footnotes'],
dtype='object'),
Index(['Country', 'Year', 'Unit', 'Quantity', 'Quantity Footnotes'], dtype='object'))
#Países que se encuentran solo en CO2 dataframe y no en solarDataframe
OnlyCO2=set(co2_new.Country)-set(solar_new.Country)
OnlyCO2
{'European Union', 'Iceland', 'Monaco', 'United States Of America'}
#Países que se encuentran solo en solar dataframe y no en CO2 dataframe
OnlySolar=set(solar_new.Country)-set(co2_new.Country)
OnlySolar
{'1',
'Afghanistan',
'Albania',
'Algeria',
'American Samoa',
'Andorra',
'Angola',
'Anguilla',
'Antigua And Barbuda',
'Argentina',
'Armenia',
'Aruba',
'Azerbaijan',
'Bahamas',
'Bahrain',
'Bangladesh',
'Barbados',
'Belize',
'Benin',
'Bermuda',
'Bhutan',
'Bolivia (Plur. State Of)',
'Bonaire, St Eustatius, Saba',
'Bosnia And Herzegovina',
'Botswana',
'Brazil',
'British Virgin Islands',
'Brunei Darussalam',
'Burkina Faso',
'Burundi',
'Cabo Verde',
'Cambodia',
'Cameroon',
'Central African Rep.',
'Chile',
'China',
'Colombia',
'Congo',
'Cook Islands',
'Costa Rica',
'Cuba',
'Curaçao',
"Côte D'Ivoire",
'Dem. Rep. Of The Congo',
'Djibouti',
'Dominica',
'Dominican Republic',
'Ecuador',
'Egypt',
'El Salvador',
'Eritrea',
'Eswatini',
'Ethiopia',
'Faeroe Islands',
'Falkland Is. (Malvinas)',
'Fiji',
'Fnseqid',
'French Polynesia',
'Gabon',
'Gambia',
'Ghana',
'Grenada',
'Guam',
'Guatemala',
'Guernsey',
'Guinea-Bissau',
'Guyana',
'Haiti',
'Honduras',
'India',
'Indonesia',
'Iran (Islamic Rep. Of)',
'Iraq',
'Israel',
'Jamaica',
'Jordan',
'Kazakhstan',
'Kenya',
'Kiribati',
"Korea, Dem.Ppl'S.Rep.",
'Korea, Republic Of',
'Kosovo',
'Kuwait',
"Lao People'S Dem. Rep.",
'Lebanon',
'Liberia',
'Libya',
'Madagascar',
'Malawi',
'Malaysia',
'Maldives',
'Mali',
'Marshall Islands',
'Mauritania',
'Mauritius',
'Mexico',
'Micronesia (Fed. States Of)',
'Mongolia',
'Montenegro',
'Montserrat',
'Morocco',
'Mozambique',
'Myanmar',
'Namibia',
'Nauru',
'Nepal',
'New Caledonia',
'Nicaragua',
'Niger',
'Nigeria',
'Niue',
'North Macedonia',
'Northern Mariana Islands',
'Oman',
'Other Asia',
'Pakistan',
'Palau',
'Panama',
'Papua New Guinea',
'Peru',
'Philippines',
'Puerto Rico',
'Republic Of Moldova',
'Rwanda',
'Samoa',
'Sao Tome And Principe',
'Saudi Arabia',
'Senegal',
'Serbia',
'Seychelles',
'Sierra Leone',
'Singapore',
'Solomon Islands',
'Somalia',
'South Africa',
'South Sudan',
'Sri Lanka',
'St. Helena And Depend.',
'St. Kitts-Nevis',
'St. Lucia',
'St. Vincent-Grenadines',
'State Of Palestine',
'Sudan',
'Suriname',
'Thailand',
'Timor-Leste',
'Togo',
'Tonga',
'Trinidad And Tobago',
'Tunisia',
'Turks And Caicos Islands',
'Tuvalu',
'Uganda',
'United Arab Emirates',
'United Rep. Of Tanzania',
'United States',
'United States Virgin Is.',
'Uruguay',
'Uzbekistan',
'Vanuatu',
'Venezuela (Bolivar. Rep.)',
'Viet Nam',
'Wallis And Futuna Is.',
'Yemen',
'Zambia',
'Zimbabwe'}
from thefuzz import process as fz
# Tomamos un pais de OnlySolar
# Miramos en un país de OnlyCO2 y retornamos el más similar
[(f,fz.extractOne(f, OnlySolar)) for f in sorted(OnlyCO2)]
[('European Union', ('Benin', 60)),
('Iceland', ('Thailand', 67)),
('Monaco', ('Morocco', 62)),
('United States Of America', ('United States', 90))]
Selecciono los mejores matches:
match_CO2_Solar = [(f,fz.extractOne(f, OnlySolar)) for f in sorted(OnlyCO2)
if fz.extractOne(f, OnlySolar)[1]>=87]
match_CO2_Solar= {f:fz.extractOne(f, OnlySolar)[0]
for f in sorted(OnlyCO2)
if fz.extractOne(f, OnlySolar)[1] >=87}
match_CO2_Solar
{'United States Of America': 'United States'}
Reemplazo los nuevos nombres en el dataframe de CO2
co2_new.Country.replace(to_replace=match_CO2_Solar,inplace=True)
co2_new
| Country | Year | Value | |
|---|---|---|---|
| 0 | Australia | 1990 | 2.781542e+05 |
| 1 | Austria | 1990 | 6.214525e+04 |
| 2 | Belarus | 1990 | 1.083451e+05 |
| 3 | Belgium | 1990 | 1.202926e+05 |
| 4 | Bulgaria | 1990 | 7.669920e+04 |
| 5 | Canada | 1990 | 4.582183e+05 |
| 6 | Croatia | 1990 | 2.297979e+04 |
| 7 | Cyprus | 1990 | 4.653217e+03 |
| 8 | Czechia | 1990 | 1.642107e+05 |
| 9 | Denmark | 1990 | 5.487931e+04 |
| 10 | Estonia | 1990 | 3.692221e+04 |
| 11 | European Union | 1990 | 4.470262e+06 |
| 12 | Finland | 1990 | 5.691434e+04 |
| 13 | France | 1990 | 4.009646e+05 |
| 14 | Germany | 1990 | 1.051979e+06 |
| 15 | Greece | 1990 | 8.343804e+04 |
| 16 | Hungary | 1990 | 7.322554e+04 |
| 17 | Iceland | 1990 | 2.215856e+03 |
| 18 | Ireland | 1990 | 3.294442e+04 |
| 19 | Italy | 1990 | 4.395498e+05 |
| 20 | Japan | 1990 | 1.158129e+06 |
| 21 | Latvia | 1990 | 1.966140e+04 |
| 22 | Liechtenstein | 1990 | 1.989709e+02 |
| 23 | Lithuania | 1990 | 3.576773e+04 |
| 24 | Luxembourg | 1990 | 1.182335e+04 |
| 25 | Malta | 1990 | 2.394194e+03 |
| 26 | Monaco | 1990 | 9.822834e+01 |
| 27 | Netherlands | 1990 | 1.618069e+05 |
| 28 | New Zealand | 1990 | 2.550251e+04 |
| 29 | Norway | 1990 | 3.509663e+04 |
| 30 | Poland | 1990 | 3.768136e+05 |
| 31 | Portugal | 1990 | 4.532509e+04 |
| 32 | Romania | 1990 | 1.734639e+05 |
| 33 | Russian Federation | 1990 | 2.534865e+06 |
| 34 | Slovakia | 1990 | 6.147019e+04 |
| 35 | Slovenia | 1990 | 1.509485e+04 |
| 36 | Spain | 1990 | 2.313284e+05 |
| 37 | Sweden | 1990 | 5.758009e+04 |
| 38 | Switzerland | 1990 | 4.415988e+04 |
| 39 | Türkiye | 1990 | 1.516645e+05 |
| 40 | Ukraine | 1990 | 7.058300e+05 |
| 41 | United Kingdom | 1990 | 6.026521e+05 |
| 42 | United States | 1990 | 5.122496e+06 |
Merge entre dataframe de CO2 y SolarEnergy manteniendo lo que es común en ambas columnas
co2_solar = co2_new.merge(solar_new,how='inner',left_on='Country',right_on='Country')
co2_solar
| Country | Year_x | Value | Year_y | Unit | Quantity | Quantity Footnotes | |
|---|---|---|---|---|---|---|---|
| 0 | Australia | 1990 | 2.781542e+05 | 2020 | Kilowatt-hours, million | 21033.194 | NaN |
| 1 | Austria | 1990 | 6.214525e+04 | 2020 | Kilowatt-hours, million | 2042.934 | NaN |
| 2 | Belarus | 1990 | 1.083451e+05 | 2020 | Kilowatt-hours, million | 176.000 | NaN |
| 3 | Belgium | 1990 | 1.202926e+05 | 2020 | Kilowatt-hours, million | 5105.400 | NaN |
| 4 | Bulgaria | 1990 | 7.669920e+04 | 2020 | Kilowatt-hours, million | 1480.856 | NaN |
| 5 | Canada | 1990 | 4.582183e+05 | 2020 | Kilowatt-hours, million | 4846.000 | NaN |
| 6 | Croatia | 1990 | 2.297979e+04 | 2020 | Kilowatt-hours, million | 95.500 | NaN |
| 7 | Cyprus | 1990 | 4.653217e+03 | 2020 | Kilowatt-hours, million | 295.608 | NaN |
| 8 | Czechia | 1990 | 1.642107e+05 | 2020 | Kilowatt-hours, million | 2287.014 | NaN |
| 9 | Denmark | 1990 | 5.487931e+04 | 2020 | Kilowatt-hours, million | 1180.532 | NaN |
| 10 | Estonia | 1990 | 3.692221e+04 | 2020 | Kilowatt-hours, million | 122.540 | NaN |
| 11 | Finland | 1990 | 5.691434e+04 | 2020 | Kilowatt-hours, million | 218.471 | NaN |
| 12 | France | 1990 | 4.009646e+05 | 2020 | Kilowatt-hours, million | 13398.392 | NaN |
| 13 | Germany | 1990 | 1.051979e+06 | 2020 | Kilowatt-hours, million | 48641.000 | NaN |
| 14 | Greece | 1990 | 8.343804e+04 | 2020 | Kilowatt-hours, million | 4446.854 | NaN |
| 15 | Hungary | 1990 | 7.322554e+04 | 2020 | Kilowatt-hours, million | 2459.000 | NaN |
| 16 | Ireland | 1990 | 3.294442e+04 | 2020 | Kilowatt-hours, million | 63.780 | NaN |
| 17 | Italy | 1990 | 4.395498e+05 | 2020 | Kilowatt-hours, million | 24941.504 | NaN |
| 18 | Japan | 1990 | 1.158129e+06 | 2020 | Kilowatt-hours, million | 79086.858 | NaN |
| 19 | Latvia | 1990 | 1.966140e+04 | 2020 | Kilowatt-hours, million | 4.847 | NaN |
| 20 | Liechtenstein | 1990 | 1.989709e+02 | 2020 | Kilowatt-hours, million | 29.700 | NaN |
| 21 | Lithuania | 1990 | 3.576773e+04 | 2020 | Kilowatt-hours, million | 128.800 | NaN |
| 22 | Luxembourg | 1990 | 1.182335e+04 | 2020 | Kilowatt-hours, million | 161.261 | NaN |
| 23 | Malta | 1990 | 2.394194e+03 | 2020 | Kilowatt-hours, million | 236.837 | NaN |
| 24 | Netherlands | 1990 | 1.618069e+05 | 2020 | Kilowatt-hours, million | 8765.172 | NaN |
| 25 | New Zealand | 1990 | 2.550251e+04 | 2020 | Kilowatt-hours, million | 160.216 | NaN |
| 26 | Norway | 1990 | 3.509663e+04 | 2020 | Kilowatt-hours, million | 26.521 | NaN |
| 27 | Poland | 1990 | 3.768136e+05 | 2020 | Kilowatt-hours, million | 1957.916 | NaN |
| 28 | Portugal | 1990 | 4.532509e+04 | 2020 | Kilowatt-hours, million | 1715.590 | NaN |
| 29 | Romania | 1990 | 1.734639e+05 | 2020 | Kilowatt-hours, million | 1733.375 | NaN |
| 30 | Russian Federation | 1990 | 2.534865e+06 | 2020 | Kilowatt-hours, million | 2022.400 | NaN |
| 31 | Slovakia | 1990 | 6.147019e+04 | 2020 | Kilowatt-hours, million | 663.000 | NaN |
| 32 | Slovenia | 1990 | 1.509485e+04 | 2020 | Kilowatt-hours, million | 368.196 | NaN |
| 33 | Spain | 1990 | 2.313284e+05 | 2020 | Kilowatt-hours, million | 20667.000 | NaN |
| 34 | Sweden | 1990 | 5.758009e+04 | 2020 | Kilowatt-hours, million | 1051.000 | NaN |
| 35 | Switzerland | 1990 | 4.415988e+04 | 2020 | Kilowatt-hours, million | 2598.659 | NaN |
| 36 | Türkiye | 1990 | 1.516645e+05 | 2020 | Kilowatt-hours, million | 10950.180 | NaN |
| 37 | Ukraine | 1990 | 7.058300e+05 | 2020 | Kilowatt-hours, million | 5969.400 | NaN |
| 38 | United Kingdom | 1990 | 6.026521e+05 | 2020 | Kilowatt-hours, million | 13157.994 | NaN |
| 39 | United States | 1990 | 5.122496e+06 | 2020 | Kilowatt-hours, million | 119328.543 | NaN |
OnlyCO2_Solar=set(co2_solar.Country)-set(precipitation_new.Country)
OnlyCO2_Solar
{'Czechia',
'Liechtenstein',
'Russian Federation',
'Türkiye',
'United Kingdom',
'United States'}
OnlyPrep=set(precipitation_new.Country)-set(co2_solar.Country)
OnlyPrep
{'1',
'2',
'3',
'Afghanistan, Islamic State Of',
'Albania',
'Algeria',
'Argentina',
'Argentina (Antarctic Stations)',
'Armenia',
'Australia (Antarctic Stations)',
'Azerbaijan',
'Bahamas',
'Bahrain',
'Belize',
'Benin',
'Bosnia And Herzegovina',
'Brazil',
'Brunei Darussalam',
'Cameroon',
'Cape Verde',
'Chad',
'Chile',
'Chile (Antarctic Stations)',
'China',
'Colombia',
'Colombia (San Andres And Providencia Islands)',
'Costa Rica',
'Cuba',
'Czech Republic',
'Djibouti',
'Dominica',
'Dominican Republic',
'Ecuador',
'Egypt',
'El Salvador',
'Eritrea',
'Fiji',
'Footnoteseqid',
'France (Caribbean Islands, Guadeloupe, Martinique)',
'France (French Department Of Guyana)',
'France (Islands In The Indian Ocean)',
'French Polynesia',
'Gabon',
'Georgia',
'Greenland',
'Guinea',
'Guyana',
'Honduras',
'Hong Kong',
'Iceland',
'India',
'Iran, Islamic Republic Of',
'Israel',
'Ivory Coast',
'Jordan',
'Kazakstan (Asia)',
'Kazakstan (Europe)',
'Kenya',
"Korea, Democratic People'S Republic Of",
'Korea, Republic Of',
'Kuwait',
'Kyrgyz Republic',
"Lao People'S Democratic Republic",
'Lebanon',
'Madagascar',
'Malawi',
'Malaysia',
'Maldives',
'Mali',
'Mauritius',
'Mexico',
'Moldova, Republic Of',
'Mongolia',
'Morocco',
'Myanmar',
'Netherlands Antilles And Aruba',
'New Caledonia',
'Nicaragua',
'Niger',
'Nigeria',
'Oman',
'Pakistan',
'Paraguay',
'Peru',
'Philippines',
'Portugal (Madeira)',
'Qatar',
'Russian Federation (Asia)',
'Russian Federation (Europe)',
'Rwanda',
'Saudi Arabia',
'Senegal',
'Seychelles',
'Sierra Leone',
'Singapore',
'Solomon Islands',
'South Africa',
'Spain (Canary Islands, Ceuta And Melilla)',
'Sri Lanka',
'Sudan',
'Syrian Arab Republic',
'Tajikistan',
'Tanzania, United Republic Of',
'Thailand',
'The Former Yugoslav Republic Of Macedonia',
'Togo',
'Trinidad And Tobago',
'Tunisia',
'Turkey',
'Turkmenistan',
'United Arab Emirates',
'United Kingdom Of Great Britain & Northern Ireland',
'United States Of America',
'United States Of America (Antarctic Stations)',
'United States Of America (Pacific Islands)',
'Uruguay',
'Uzbekistan',
'Venezuela',
'Yugoslavia',
'Zambia',
'Zimbabwe'}
[(f,fz.extractOne(f, OnlyCO2_Solar)) for f in sorted(OnlyPrep)]
[('1', ('United Kingdom', 0)),
('2', ('United Kingdom', 0)),
('3', ('United Kingdom', 0)),
('Afghanistan, Islamic State Of', ('United States', 52)),
('Albania', ('Liechtenstein', 40)),
('Algeria', ('Czechia', 43)),
('Argentina', ('Russian Federation', 51)),
('Argentina (Antarctic Stations)', ('United States', 52)),
('Armenia', ('Czechia', 43)),
('Australia (Antarctic Stations)', ('United States', 52)),
('Azerbaijan', ('Russian Federation', 53)),
('Bahamas', ('United States', 33)),
('Bahrain', ('Russian Federation', 55)),
('Belize', ('Liechtenstein', 60)),
('Benin', ('Liechtenstein', 68)),
('Bosnia And Herzegovina', ('Russian Federation', 45)),
('Brazil', ('Russian Federation', 45)),
('Brunei Darussalam', ('Russian Federation', 43)),
('Cameroon', ('Russian Federation', 48)),
('Cape Verde', ('Russian Federation', 45)),
('Chad', ('Czechia', 68)),
('Chile', ('Liechtenstein', 54)),
('Chile (Antarctic Stations)', ('Czechia', 54)),
('China', ('Czechia', 67)),
('Colombia', ('Czechia', 40)),
('Colombia (San Andres And Providencia Islands)', ('Czechia', 51)),
('Costa Rica', ('Russian Federation', 43)),
('Cuba', ('Czechia', 45)),
('Czech Republic', ('Czechia', 75)),
('Djibouti', ('United Kingdom', 43)),
('Dominica', ('United Kingdom', 53)),
('Dominican Republic', ('Russian Federation', 37)),
('Ecuador', ('Czechia', 43)),
('Egypt', ('Liechtenstein', 36)),
('El Salvador', ('United States', 33)),
('Eritrea', ('United Kingdom', 47)),
('Fiji', ('Russian Federation', 34)),
('Footnoteseqid', ('United States', 44)),
('France (Caribbean Islands, Guadeloupe, Martinique)', ('Czechia', 51)),
('France (French Department Of Guyana)', ('Russian Federation', 47)),
('France (Islands In The Indian Ocean)', ('United Kingdom', 51)),
('French Polynesia', ('Czechia', 49)),
('Gabon', ('Russian Federation', 54)),
('Georgia', ('Czechia', 43)),
('Greenland', ('Russian Federation', 40)),
('Guinea', ('United Kingdom', 49)),
('Guyana', ('United Kingdom', 45)),
('Honduras', ('Russian Federation', 45)),
('Hong Kong', ('United Kingdom', 43)),
('Iceland', ('Liechtenstein', 51)),
('India', ('United Kingdom', 54)),
('Iran, Islamic Republic Of', ('Russian Federation', 36)),
('Israel', ('United States', 47)),
('Ivory Coast', ('Russian Federation', 39)),
('Jordan', ('Russian Federation', 45)),
('Kazakstan (Asia)', ('Czechia', 40)),
('Kazakstan (Europe)', ('Russian Federation', 40)),
('Kenya', ('Liechtenstein', 45)),
("Korea, Democratic People'S Republic Of", ('Russian Federation', 48)),
('Korea, Republic Of', ('Russian Federation', 38)),
('Kuwait', ('United Kingdom', 54)),
('Kyrgyz Republic', ('Türkiye', 40)),
("Lao People'S Democratic Republic", ('Russian Federation', 44)),
('Lebanon', ('Russian Federation', 51)),
('Madagascar', ('Russian Federation', 27)),
('Malawi', ('Liechtenstein', 45)),
('Malaysia', ('Russian Federation', 43)),
('Maldives', ('Liechtenstein', 49)),
('Mali', ('Liechtenstein', 60)),
('Mauritius', ('Russian Federation', 48)),
('Mexico', ('Russian Federation', 45)),
('Moldova, Republic Of', ('Russian Federation', 36)),
('Mongolia', ('United Kingdom', 42)),
('Morocco', ('Czechia', 29)),
('Myanmar', ('Russian Federation', 38)),
('Netherlands Antilles And Aruba', ('United States', 48)),
('New Caledonia', ('Russian Federation', 43)),
('Nicaragua', ('Russian Federation', 38)),
('Niger', ('United Kingdom', 54)),
('Nigeria', ('United Kingdom', 45)),
('Oman', ('United Kingdom', 60)),
('Pakistan', ('United Kingdom', 47)),
('Paraguay', ('Russian Federation', 39)),
('Peru', ('Russian Federation', 60)),
('Philippines', ('Czechia', 36)),
('Portugal (Madeira)', ('Czechia', 49)),
('Qatar', ('United States', 43)),
('Russian Federation (Asia)', ('Russian Federation', 95)),
('Russian Federation (Europe)', ('Russian Federation', 95)),
('Rwanda', ('Russian Federation', 45)),
('Saudi Arabia', ('Russian Federation', 43)),
('Senegal', ('Liechtenstein', 45)),
('Seychelles', ('Liechtenstein', 43)),
('Sierra Leone', ('Russian Federation', 47)),
('Singapore', ('United Kingdom', 48)),
('Solomon Islands', ('United Kingdom', 39)),
('South Africa', ('Czechia', 40)),
('Spain (Canary Islands, Ceuta And Melilla)', ('Czechia', 42)),
('Sri Lanka', ('Russian Federation', 45)),
('Sudan', ('Russian Federation', 57)),
('Syrian Arab Republic', ('Russian Federation', 45)),
('Tajikistan', ('Russian Federation', 39)),
('Tanzania, United Republic Of', ('United Kingdom', 86)),
('Thailand', ('Russian Federation', 47)),
('The Former Yugoslav Republic Of Macedonia', ('Czechia', 51)),
('Togo', ('United Kingdom', 45)),
('Trinidad And Tobago', ('Türkiye', 54)),
('Tunisia', ('Russian Federation', 55)),
('Turkey', ('Türkiye', 62)),
('Turkmenistan', ('Türkiye', 55)),
('United Arab Emirates', ('United States', 86)),
('United Kingdom Of Great Britain & Northern Ireland',
('United Kingdom', 90)),
('United States Of America', ('United States', 90)),
('United States Of America (Antarctic Stations)', ('United States', 90)),
('United States Of America (Pacific Islands)', ('United States', 90)),
('Uruguay', ('Russian Federation', 42)),
('Uzbekistan', ('Czechia', 47)),
('Venezuela', ('Russian Federation', 40)),
('Yugoslavia', ('Russian Federation', 45)),
('Zambia', ('Czechia', 46)),
('Zimbabwe', ('Czechia', 40))]
[(f,fz.extractOne(f, OnlyPrep)) for f in sorted(OnlyCO2_Solar)]
#Notamos que al extraer los matchs de ambas formas se pierden datos si se toman valores >=87 o >=75 o >=62
#En algunos casos se toma datos incorrectos y en otros se obvia datos, por tanto los datos faltantes se modificarán manualmente
[('Czechia', ('Czech Republic', 75)),
('Liechtenstein', ('Benin', 68)),
('Russian Federation', ('Russian Federation (Asia)', 95)),
('Türkiye', ('Turkey', 62)),
('United Kingdom',
('United Kingdom Of Great Britain & Northern Ireland', 90)),
('United States', ('United States Of America (Pacific Islands)', 90))]
match_CO2_Solar_Prep = [(f,fz.extractOne(f, OnlyCO2_Solar)) for f in sorted(OnlyPrep)
if fz.extractOne(f, OnlyCO2_Solar)[1]>=87]
match_CO2_Solar_Prep
[('Russian Federation (Asia)', ('Russian Federation', 95)),
('Russian Federation (Europe)', ('Russian Federation', 95)),
('United Kingdom Of Great Britain & Northern Ireland',
('United Kingdom', 90)),
('United States Of America', ('United States', 90)),
('United States Of America (Antarctic Stations)', ('United States', 90)),
('United States Of America (Pacific Islands)', ('United States', 90))]
match_CO2_Solar_Prep= {f:fz.extractOne(f, OnlyCO2_Solar)[0]
for f in sorted(OnlyPrep)
if fz.extractOne(f, OnlyCO2_Solar)[1] >=87}
match_CO2_Solar_Prep
{'Russian Federation (Asia)': 'Russian Federation',
'Russian Federation (Europe)': 'Russian Federation',
'United Kingdom Of Great Britain & Northern Ireland': 'United Kingdom',
'United States Of America': 'United States',
'United States Of America (Antarctic Stations)': 'United States',
'United States Of America (Pacific Islands)': 'United States'}
precipitation_new.Country.replace(to_replace=match_CO2_Solar_Prep,inplace=True)
precipitation_new.Country=precipitation_new.Country.str.replace('Czech Republic', 'Czechia')
precipitation_new.loc[precipitation_new['Country']=='Czechia']
| Country | Station Name | Period | Statistic Description | Unit | Jan | Jan Footnotes | Feb | Feb Footnotes | Mar | ... | Oct | Oct Footnotes | Nov | Nov Footnotes | Dec | Dec Footnotes | Annual | Annual Footnotes | Annual NCDC Computed Value | Annual NCDC Computed Value Footnotes | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 888 | Czechia | Cheb | 1961-1990 | Mean Monthly Value | mm | 36.1 | NaN | 29.5 | NaN | 36.3 | ... | 37.5 | NaN | 41.1 | NaN | 43.9 | NaN | 559.8 | NaN | 559.7 | NaN |
| 889 | Czechia | Praha 6, Ruzyne | 1961-1990 | Mean Monthly Value | mm | 23.6 | NaN | 22.6 | NaN | 28.1 | ... | 30.5 | NaN | 31.9 | NaN | 25.3 | NaN | 526.2 | NaN | 526.3 | NaN |
| 890 | Czechia | Brno 4, Turany | 1961-1990 | Mean Monthly Value | mm | 24.5 | NaN | 23.7 | NaN | 24.2 | ... | 30.5 | NaN | 37.5 | NaN | 27.1 | NaN | 490.1 | NaN | 490.0 | NaN |
| 891 | Czechia | Mosnov, Ostrava | 1961-1990 | Mean Monthly Value | mm | 26.7 | NaN | 30.2 | NaN | 34.0 | ... | 42.3 | NaN | 44.6 | NaN | 34.3 | NaN | 701.8 | NaN | 701.8 | NaN |
4 rows × 33 columns
precipitation_new.Country=precipitation_new.Country.str.replace('Turkey', 'Türkiye')
precipitation_new.loc[precipitation_new['Country']=='Türkiye']
| Country | Station Name | Period | Statistic Description | Unit | Jan | Jan Footnotes | Feb | Feb Footnotes | Mar | ... | Oct | Oct Footnotes | Nov | Nov Footnotes | Dec | Dec Footnotes | Annual | Annual Footnotes | Annual NCDC Computed Value | Annual NCDC Computed Value Footnotes | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 3088 | Türkiye | ZONGULDAK | 1961-1990 | Mean Monthly Value | mm | 136.9 | NaN | 91.6 | NaN | 89.8 | ... | 139.1 | NaN | 142.3 | NaN | 159.3 | NaN | -9999.9 | 1.0 | 1194.1 | NaN |
| 3089 | Türkiye | SAMSUN | 1961-1990 | Mean Monthly Value | mm | 61.4 | NaN | 50.0 | NaN | 55.8 | ... | 84.7 | NaN | 89.3 | NaN | 82.2 | NaN | -9999.9 | 1.0 | 691.5 | NaN |
| 3090 | Türkiye | GIRESUN | 1961-1990 | Mean Monthly Value | mm | 121.6 | NaN | 89.2 | NaN | 88.9 | ... | 159.6 | NaN | 142.7 | NaN | 131.3 | NaN | -9999.9 | 1.0 | 1236.1 | NaN |
| 3091 | Türkiye | RIZE | 1961-1990 | Mean Monthly Value | mm | 216.5 | NaN | 172.7 | NaN | 147.1 | ... | 272.2 | NaN | 249.5 | NaN | 242.5 | NaN | -9999.9 | 1.0 | 2171.0 | NaN |
| 3092 | Türkiye | EDIRNE | 1961-1990 | Mean Monthly Value | mm | 61.4 | NaN | 53.3 | NaN | 55.0 | ... | 44.8 | NaN | 64.0 | NaN | 70.6 | NaN | -9999.9 | 1.0 | 577.6 | NaN |
| 3093 | Türkiye | GOZTEPE | 1961-1990 | Mean Monthly Value | mm | 98.7 | NaN | 66.9 | NaN | 62.1 | ... | 70.7 | NaN | 88.9 | NaN | 121.7 | NaN | -9999.9 | 1.0 | 696.7 | NaN |
| 3094 | Türkiye | KASTAMONU | 1961-1990 | Mean Monthly Value | mm | 33.2 | NaN | 27.6 | NaN | 32.9 | ... | 39.3 | NaN | 29.3 | NaN | 37.6 | NaN | -9999.9 | 1.0 | 483.6 | NaN |
| 3095 | Türkiye | CORUM | 1961-1990 | Mean Monthly Value | mm | 41.9 | NaN | 31.3 | NaN | 36.0 | ... | 30.4 | NaN | 36.5 | NaN | 50.7 | NaN | -9999.9 | 1.0 | 449.2 | NaN |
| 3096 | Türkiye | SIVAS | 1961-1990 | Mean Monthly Value | mm | 41.4 | NaN | 36.6 | NaN | 46.1 | ... | 32.1 | NaN | 38.9 | NaN | 47.2 | NaN | -9999.9 | 1.0 | 427.4 | NaN |
| 3097 | Türkiye | ERZINCAN | 1961-1990 | Mean Monthly Value | mm | 29.4 | NaN | 29.2 | NaN | 39.1 | ... | 41.9 | NaN | 39.2 | NaN | 31.3 | NaN | -9999.9 | 1.0 | 381.2 | NaN |
| 3098 | Türkiye | ERZURUM | 1961-1990 | Mean Monthly Value | mm | 21.3 | NaN | 25.3 | NaN | 29.2 | ... | 45.8 | NaN | 36.9 | NaN | 21.3 | NaN | -9999.9 | 1.0 | 414.1 | NaN |
| 3099 | Türkiye | KARS | 1961-1990 | Mean Monthly Value | mm | 18.8 | NaN | 22.4 | NaN | 26.0 | ... | 38.5 | NaN | 25.7 | NaN | 20.8 | NaN | -9999.9 | 1.0 | 462.9 | NaN |
| 3100 | Türkiye | CANAKKALE | 1961-1990 | Mean Monthly Value | mm | 97.4 | NaN | 66.2 | NaN | 61.1 | ... | 47.8 | NaN | 88.8 | NaN | 117.5 | NaN | -9999.9 | 1.0 | 634.1 | NaN |
| 3101 | Türkiye | BURSA | 1961-1990 | Mean Monthly Value | mm | 89.6 | NaN | 85.8 | NaN | 64.0 | ... | 58.0 | NaN | 80.6 | NaN | 118.4 | NaN | -9999.9 | 1.0 | 706.1 | NaN |
| 3102 | Türkiye | ESKISEHIR | 1961-1990 | Mean Monthly Value | mm | 45.2 | NaN | 33.9 | NaN | 40.5 | ... | 28.8 | NaN | 31.0 | NaN | 50.8 | NaN | -9999.9 | 1.0 | 387.8 | NaN |
| 3103 | Türkiye | ANKARA | 1961-1990 | Mean Monthly Value | mm | 47.0 | NaN | 36.3 | NaN | 36.3 | ... | 26.8 | NaN | 33.4 | NaN | 49.0 | NaN | -9999.9 | 1.0 | 414.6 | NaN |
| 3104 | Türkiye | BALIKESIR | 1961-1990 | Mean Monthly Value | mm | 87.8 | NaN | 63.6 | NaN | 55.6 | ... | 38.0 | NaN | 78.8 | NaN | 107.8 | NaN | -9999.9 | 1.0 | 587.2 | NaN |
| 3105 | Türkiye | VAN | 1961-1990 | Mean Monthly Value | mm | 34.8 | NaN | 33.9 | NaN | 42.3 | ... | 48.3 | NaN | 42.3 | NaN | 33.5 | NaN | -9999.9 | 1.0 | 385.1 | NaN |
| 3106 | Türkiye | USAK | 1961-1990 | Mean Monthly Value | mm | 75.2 | NaN | 65.5 | NaN | 57.3 | ... | 36.1 | NaN | 58.5 | NaN | 86.2 | NaN | -9999.9 | 1.0 | 541.1 | NaN |
| 3107 | Türkiye | AFYON | 1961-1990 | Mean Monthly Value | mm | 37.3 | NaN | 34.9 | NaN | 39.8 | ... | 37.2 | NaN | 30.8 | NaN | 46.2 | NaN | -9999.9 | 1.0 | 398.6 | NaN |
| 3108 | Türkiye | KAYSERI | 1961-1990 | Mean Monthly Value | mm | 32.7 | NaN | 31.2 | NaN | 40.2 | ... | 28.2 | NaN | 36.0 | NaN | 39.3 | NaN | -9999.9 | 1.0 | 386.7 | NaN |
| 3109 | Türkiye | MALATYA | 1961-1990 | Mean Monthly Value | mm | 41.9 | NaN | 35.6 | NaN | 59.5 | ... | 40.4 | NaN | 47.0 | NaN | 42.3 | NaN | -9999.9 | 1.0 | 410.9 | NaN |
| 3110 | Türkiye | SIIRT | 1961-1990 | Mean Monthly Value | mm | 89.5 | NaN | 96.7 | NaN | 102.0 | ... | 52.7 | NaN | 81.5 | NaN | 96.9 | NaN | -9999.9 | 1.0 | 710.4 | NaN |
| 3111 | Türkiye | IZMIR | 1961-1990 | Mean Monthly Value | mm | 132.3 | NaN | 99.1 | NaN | 76.4 | ... | 40.3 | NaN | 87.1 | NaN | 153.0 | NaN | -9999.9 | 1.0 | 692.0 | NaN |
| 3112 | Türkiye | ISPARTA | 1961-1990 | Mean Monthly Value | mm | 78.7 | NaN | 74.3 | NaN | 54.6 | ... | 36.4 | NaN | 46.3 | NaN | 90.1 | NaN | -9999.9 | 1.0 | 546.2 | NaN |
| 3113 | Türkiye | KONYA | 1961-1990 | Mean Monthly Value | mm | 38.8 | NaN | 32.0 | NaN | 28.8 | ... | 31.7 | NaN | 32.5 | NaN | 44.1 | NaN | -9999.9 | 1.0 | 338.2 | NaN |
| 3114 | Türkiye | NIGDE | 1961-1990 | Mean Monthly Value | mm | 30.5 | NaN | 31.3 | NaN | 33.1 | ... | 23.7 | NaN | 30.0 | NaN | 40.1 | NaN | -9999.9 | 1.0 | 323.6 | NaN |
| 3115 | Türkiye | KAHRAMANMARAS | 1961-1990 | Mean Monthly Value | mm | 125.1 | NaN | 110.9 | NaN | 96.6 | ... | 48.7 | NaN | 77.0 | NaN | 132.0 | NaN | -9999.9 | 1.0 | 718.6 | NaN |
| 3116 | Türkiye | URFA | 1961-1990 | Mean Monthly Value | mm | 84.6 | NaN | 73.5 | NaN | 71.1 | ... | 32.5 | NaN | 42.6 | NaN | 87.0 | NaN | -9999.9 | 1.0 | 482.1 | NaN |
| 3117 | Türkiye | DIYARBAKIR | 1961-1990 | Mean Monthly Value | mm | 70.3 | NaN | 69.7 | NaN | 70.9 | ... | 34.0 | NaN | 52.2 | NaN | 75.9 | NaN | -9999.9 | 1.0 | 498.4 | NaN |
| 3118 | Türkiye | HAKKARI | 1961-1990 | Mean Monthly Value | mm | 89.3 | NaN | 105.2 | NaN | 131.1 | ... | 70.4 | NaN | 96.7 | NaN | 93.4 | NaN | -9999.9 | 1.0 | 818.4 | NaN |
| 3119 | Türkiye | MUGLA | 1961-1990 | Mean Monthly Value | mm | 243.9 | NaN | 191.9 | NaN | 121.3 | ... | 71.9 | NaN | 135.3 | NaN | 281.0 | NaN | -9999.9 | 1.0 | 1222.1 | NaN |
| 3120 | Türkiye | ANTALYA | 1961-1990 | Mean Monthly Value | mm | 237.9 | NaN | 191.0 | NaN | 101.5 | ... | 70.4 | NaN | 149.7 | NaN | 222.7 | NaN | -9999.9 | 1.0 | 1077.5 | NaN |
| 3121 | Türkiye | MERSIN | 1961-1990 | Mean Monthly Value | mm | 120.5 | NaN | 110.4 | NaN | 57.9 | ... | 40.2 | NaN | 75.7 | NaN | 135.6 | NaN | -9999.9 | 1.0 | 633.8 | NaN |
| 3122 | Türkiye | ADANA | 1961-1990 | Mean Monthly Value | mm | 117.7 | NaN | 91.8 | NaN | 68.1 | ... | 46.6 | NaN | 74.6 | NaN | 138.3 | NaN | -9999.9 | 1.0 | 698.1 | NaN |
35 rows × 33 columns
Ahora, sí se procede a realizar el merge entre el datafram anterior y el de precipitacion:
co2_solar_prep = co2_solar.merge(precipitation_new,how='inner',left_on='Country',right_on='Country')
co2_solar_prep
| Country | Year_x | Value | Year_y | Unit_x | Quantity | Quantity Footnotes | Station Name | Period | Statistic Description | ... | Oct | Oct Footnotes | Nov | Nov Footnotes | Dec | Dec Footnotes | Annual | Annual Footnotes | Annual NCDC Computed Value | Annual NCDC Computed Value Footnotes | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Australia | 1990 | 2.781542e+05 | 2020 | Kilowatt-hours, million | 21033.194 | NaN | ALBANY (ALBANY TOWN) | 1961-1965 | Mean Monthly Value | ... | 96.0 | NaN | 37.9 | NaN | 23.0 | NaN | -9999.9 | 1.0 | 1012.3 | NaN |
| 1 | Australia | 1990 | 2.781542e+05 | 2020 | Kilowatt-hours, million | 21033.194 | NaN | ESPERANCE (ESPERANCE POST OFFICE) | 1961-1969 | Mean Monthly Value | ... | 51.3 | NaN | 19.3 | NaN | 17.9 | NaN | -9999.9 | 1.0 | 698.8 | NaN |
| 2 | Australia | 1990 | 2.781542e+05 | 2020 | Kilowatt-hours, million | 21033.194 | NaN | TENNANT CRK POST OFFICE WAS 014050 | 1961-1970 | Mean Monthly Value | ... | 14.3 | NaN | 11.8 | NaN | 40.6 | NaN | -9999.9 | 1.0 | 321.3 | NaN |
| 3 | Australia | 1990 | 2.781542e+05 | 2020 | Kilowatt-hours, million | 21033.194 | NaN | LONGREACH POST OFFICE | 1961-1973 | Mean Monthly Value | ... | 19.6 | NaN | 22.3 | NaN | 84.1 | NaN | -9999.9 | 1.0 | 433.7 | NaN |
| 4 | Australia | 1990 | 2.781542e+05 | 2020 | Kilowatt-hours, million | 21033.194 | NaN | ESSENDON AIRPORT AMO | 1961-1972 | Mean Monthly Value | ... | 51.5 | NaN | 42.7 | NaN | 57.5 | NaN | -9999.9 | 1.0 | 594.9 | NaN |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2570 | United States | 1990 | 5.122496e+06 | 2020 | Kilowatt-hours, million | 119328.543 | NaN | POHNPEI, PI | 1961-1990 | Mean Monthly Value | ... | 424.4 | NaN | 399.8 | NaN | 386.6 | NaN | 4769.1 | NaN | 4769.1 | NaN |
| 2571 | United States | 1990 | 5.122496e+06 | 2020 | Kilowatt-hours, million | 119328.543 | NaN | KWAJALEIN ISLAND, PI | 1961-1990 | Mean Monthly Value | ... | 302.5 | NaN | 270.8 | NaN | 205.7 | NaN | 2593.0 | NaN | 2593.0 | NaN |
| 2572 | United States | 1990 | 5.122496e+06 | 2020 | Kilowatt-hours, million | 119328.543 | NaN | MAJURO, PI | 1961-1990 | Mean Monthly Value | ... | 351.5 | NaN | 325.1 | NaN | 301.0 | NaN | 3336.0 | NaN | 3336.0 | NaN |
| 2573 | United States | 1990 | 5.122496e+06 | 2020 | Kilowatt-hours, million | 119328.543 | NaN | KOROR/W CAROLINE ISLANDS PI | 1961-1990 | Mean Monthly Value | ... | 352.3 | NaN | 287.5 | NaN | 304.3 | NaN | 3758.3 | NaN | 3758.3 | NaN |
| 2574 | United States | 1990 | 5.122496e+06 | 2020 | Kilowatt-hours, million | 119328.543 | NaN | YAP ISLAND/CAROLINE ISLANDS PI | 1961-1990 | Mean Monthly Value | ... | 304.0 | NaN | 230.4 | NaN | 228.3 | NaN | 3049.5 | NaN | 3049.5 | NaN |
2575 rows × 39 columns
A todo este resultado le asigno un nuevo nombre y es el que será modificado.
pd.options.mode.chained_assignment = None
DF = co2_solar_prep.copy()
DF.head()
| Country | Year_x | Value | Year_y | Unit_x | Quantity | Quantity Footnotes | Station Name | Period | Statistic Description | ... | Oct | Oct Footnotes | Nov | Nov Footnotes | Dec | Dec Footnotes | Annual | Annual Footnotes | Annual NCDC Computed Value | Annual NCDC Computed Value Footnotes | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Australia | 1990 | 278154.156296 | 2020 | Kilowatt-hours, million | 21033.194 | NaN | ALBANY (ALBANY TOWN) | 1961-1965 | Mean Monthly Value | ... | 96.0 | NaN | 37.9 | NaN | 23.0 | NaN | -9999.9 | 1.0 | 1012.3 | NaN |
| 1 | Australia | 1990 | 278154.156296 | 2020 | Kilowatt-hours, million | 21033.194 | NaN | ESPERANCE (ESPERANCE POST OFFICE) | 1961-1969 | Mean Monthly Value | ... | 51.3 | NaN | 19.3 | NaN | 17.9 | NaN | -9999.9 | 1.0 | 698.8 | NaN |
| 2 | Australia | 1990 | 278154.156296 | 2020 | Kilowatt-hours, million | 21033.194 | NaN | TENNANT CRK POST OFFICE WAS 014050 | 1961-1970 | Mean Monthly Value | ... | 14.3 | NaN | 11.8 | NaN | 40.6 | NaN | -9999.9 | 1.0 | 321.3 | NaN |
| 3 | Australia | 1990 | 278154.156296 | 2020 | Kilowatt-hours, million | 21033.194 | NaN | LONGREACH POST OFFICE | 1961-1973 | Mean Monthly Value | ... | 19.6 | NaN | 22.3 | NaN | 84.1 | NaN | -9999.9 | 1.0 | 433.7 | NaN |
| 4 | Australia | 1990 | 278154.156296 | 2020 | Kilowatt-hours, million | 21033.194 | NaN | ESSENDON AIRPORT AMO | 1961-1972 | Mean Monthly Value | ... | 51.5 | NaN | 42.7 | NaN | 57.5 | NaN | -9999.9 | 1.0 | 594.9 | NaN |
5 rows × 39 columns
co2_solar.shape
(40, 7)
Notamos que con el último merge se perdió solo un dato
co2_solar_prep.groupby('Country')[['Annual NCDC Computed Value']].agg('mean').shape
(39, 1)
El archivo adjuntado en Paideia PUCP fue subido al repositorio de Github
country_isosLink = 'https://github.com/IntroSpatialDataScience/ClimateChangeData/raw/main/data/country_isos.xlsx'
country_Isos = pd.read_excel(country_isosLink)
country_Isos
| Country | Officialstatename | InternetccTLD | iso2 | iso3 | |
|---|---|---|---|---|---|
| 0 | AFGHANISTAN | The Islamic Republic of Afghanistan | .af | AF | AFG |
| 1 | ALBANIA | The Republic of Albania | .al | AL | ALB |
| 2 | ALGERIA | The People's Democratic Republic of Algeria | .dz | DZ | DZA |
| 3 | ANGOLA | The Republic of Angola | .ao | AO | AGO |
| 4 | ANTIGUA AND BARBUDA | Antigua and Barbuda | .ag | AG | ATG |
| ... | ... | ... | ... | ... | ... |
| 166 | UZBEKISTAN | The Republic of Uzbekistan | .uz | UZ | UZB |
| 167 | VIET NAM | The Socialist Republic of Viet Nam | .vn | VN | VNM |
| 168 | YEMEN | The Republic of Yemen | .ye | YE | YEM |
| 169 | ZAMBIA | The Republic of Zambia | .zm | ZM | ZMB |
| 170 | ZIMBABWE | The Republic of Zimbabwe | .zw | ZW | ZWE |
171 rows × 5 columns
country_Isos.Country=country_Isos.Country.str.title()
OnlyDF=set(DF.Country)-set(country_Isos.Country)
OnlyDF
{'Netherlands',
'Russian Federation',
'Türkiye',
'United Kingdom',
'United States'}
OnlyIsos=set(country_Isos.Country)-set(DF.Country)
OnlyIsos
{'Afghanistan',
'Albania',
'Algeria',
'Angola',
'Antigua And Barbuda',
'Argentina',
'Armenia',
'Azerbaijan',
'Bahamas (The)',
'Bahrain',
'Bangladesh',
'Barbados',
'Belize',
'Benin',
'Bhutan',
'Bolivia (Plurinational State Of)',
'Bosnia And Herzegovina',
'Botswana',
'Brazil',
'Brunei Darussalam',
'Burkina Faso',
'Burundi',
'Cabo Verde',
'Cambodia',
'Cameroon',
'Central African Republic (The)',
'Chad',
'Chile',
'China',
'Colombia',
'Comoros (The)',
'Congo (The Democratic Republic Of The)',
'Congo (The)',
'Costa Rica',
'Cuba',
"C√Îte D'Ivoire",
'Djibouti',
'Dominican Republic (The)',
'Ecuador',
'Egypt',
'El Salvador',
'Equatorial Guinea',
'Eswatini',
'Ethiopia',
'Fiji',
'Gabon',
'Gambia (The)',
'Georgia',
'Ghana',
'Grenada',
'Guatemala',
'Guinea',
'Guinea-Bissau',
'Guyana',
'Haiti',
'Honduras',
'Iceland',
'India',
'Indonesia',
'Iran (Islamic Republic Of)',
'Iraq',
'Jamaica',
'Jordan',
'Kazakhstan',
'Kenya',
'Korea (The Republic Of)',
'Kuwait',
'Kyrgyzstan',
"Lao People'S Democratic Republic (The)",
'Lebanon',
'Lesotho',
'Liberia',
'Libya',
'Madagascar',
'Malawi',
'Malaysia',
'Maldives',
'Mali',
'Mauritania',
'Mauritius',
'Mexico',
'Micronesia (Federated States Of)',
'Moldova (The Republic Of)',
'Mongolia',
'Montenegro',
'Morocco',
'Mozambique',
'Myanmar',
'Namibia',
'Nepal',
'Netherlands (The)',
'Nicaragua',
'Niger (The)',
'Nigeria',
'North Macedonia',
'Oman',
'Pakistan',
'Panama',
'Papua New Guinea',
'Paraguay',
'Peru',
'Philippines (The)',
'Qatar',
'Russian Federation (The)',
'Rwanda',
'Samoa',
'Sao Tome And Principe',
'Saudi Arabia',
'Senegal',
'Serbia',
'Seychelles',
'Sierra Leone',
'Singapore',
'Solomon Islands',
'South Sudan',
'Sri Lanka',
'Sudan (The)',
'Suriname',
'Tajikistan',
'Tanzania, The United Republic Of',
'Thailand',
'Timor-Leste',
'Togo',
'Trinidad And Tobago',
'Tunisia',
'Turkey',
'Turkmenistan',
'Uganda',
'United Arab Emirates (The)',
'United Kingdom Of Great Britain And Northern Ireland (The)',
'United States Minor Outlying Islands (The)',
'Uruguay',
'Uzbekistan',
'Viet Nam',
'Yemen',
'Zambia',
'Zimbabwe'}
[(f,fz.extractOne(f, OnlyIsos)) for f in sorted(OnlyDF)]
[('Netherlands', ('Netherlands (The)', 95)),
('Russian Federation', ('Russian Federation (The)', 95)),
('Türkiye', ('Turkey', 62)),
('United Kingdom',
('United Kingdom Of Great Britain And Northern Ireland (The)', 90)),
('United States', ('United States Minor Outlying Islands (The)', 90))]
match_DF_Isos = [(f,fz.extractOne(f, OnlyIsos)) for f in sorted(OnlyDF)
if fz.extractOne(f, OnlyIsos)[1]>=60]
match_DF_Isos= {f:fz.extractOne(f, OnlyIsos)[0]
for f in sorted(OnlyDF)
if fz.extractOne(f, OnlyIsos)[1] >=60}
match_DF_Isos
{'Netherlands': 'Netherlands (The)',
'Russian Federation': 'Russian Federation (The)',
'Türkiye': 'Turkey',
'United Kingdom': 'United Kingdom Of Great Britain And Northern Ireland (The)',
'United States': 'United States Minor Outlying Islands (The)'}
#Invierto el dict para reemplazar los valores en el dataframe countryIsos
match_DF_Isos = dict(zip(match_DF_Isos.values(), match_DF_Isos.keys()))
match_DF_Isos
{'Netherlands (The)': 'Netherlands',
'Russian Federation (The)': 'Russian Federation',
'Turkey': 'Türkiye',
'United Kingdom Of Great Britain And Northern Ireland (The)': 'United Kingdom',
'United States Minor Outlying Islands (The)': 'United States'}
Realizo el reemplazo de los nuevos nombres
country_Isos.Country.replace(to_replace=match_DF_Isos,inplace=True)
country_Isos
| Country | Officialstatename | InternetccTLD | iso2 | iso3 | |
|---|---|---|---|---|---|
| 0 | Afghanistan | The Islamic Republic of Afghanistan | .af | AF | AFG |
| 1 | Albania | The Republic of Albania | .al | AL | ALB |
| 2 | Algeria | The People's Democratic Republic of Algeria | .dz | DZ | DZA |
| 3 | Angola | The Republic of Angola | .ao | AO | AGO |
| 4 | Antigua And Barbuda | Antigua and Barbuda | .ag | AG | ATG |
| ... | ... | ... | ... | ... | ... |
| 166 | Uzbekistan | The Republic of Uzbekistan | .uz | UZ | UZB |
| 167 | Viet Nam | The Socialist Republic of Viet Nam | .vn | VN | VNM |
| 168 | Yemen | The Republic of Yemen | .ye | YE | YEM |
| 169 | Zambia | The Republic of Zambia | .zm | ZM | ZMB |
| 170 | Zimbabwe | The Republic of Zimbabwe | .zw | ZW | ZWE |
171 rows × 5 columns
Realizo el merge entre mi Dataframe y el nuevo DF manteniendo lo común entre ambos.
DF = DF.merge(country_Isos,how='inner',left_on='Country',right_on='Country')
DF
| Country | Year_x | Value | Year_y | Unit_x | Quantity | Quantity Footnotes | Station Name | Period | Statistic Description | ... | Dec | Dec Footnotes | Annual | Annual Footnotes | Annual NCDC Computed Value | Annual NCDC Computed Value Footnotes | Officialstatename | InternetccTLD | iso2 | iso3 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Australia | 1990 | 2.781542e+05 | 2020 | Kilowatt-hours, million | 21033.194 | NaN | ALBANY (ALBANY TOWN) | 1961-1965 | Mean Monthly Value | ... | 23.0 | NaN | -9999.9 | 1.0 | 1012.3 | NaN | The Commonwealth of Australia | .au | AU | AUS |
| 1 | Australia | 1990 | 2.781542e+05 | 2020 | Kilowatt-hours, million | 21033.194 | NaN | ESPERANCE (ESPERANCE POST OFFICE) | 1961-1969 | Mean Monthly Value | ... | 17.9 | NaN | -9999.9 | 1.0 | 698.8 | NaN | The Commonwealth of Australia | .au | AU | AUS |
| 2 | Australia | 1990 | 2.781542e+05 | 2020 | Kilowatt-hours, million | 21033.194 | NaN | TENNANT CRK POST OFFICE WAS 014050 | 1961-1970 | Mean Monthly Value | ... | 40.6 | NaN | -9999.9 | 1.0 | 321.3 | NaN | The Commonwealth of Australia | .au | AU | AUS |
| 3 | Australia | 1990 | 2.781542e+05 | 2020 | Kilowatt-hours, million | 21033.194 | NaN | LONGREACH POST OFFICE | 1961-1973 | Mean Monthly Value | ... | 84.1 | NaN | -9999.9 | 1.0 | 433.7 | NaN | The Commonwealth of Australia | .au | AU | AUS |
| 4 | Australia | 1990 | 2.781542e+05 | 2020 | Kilowatt-hours, million | 21033.194 | NaN | ESSENDON AIRPORT AMO | 1961-1972 | Mean Monthly Value | ... | 57.5 | NaN | -9999.9 | 1.0 | 594.9 | NaN | The Commonwealth of Australia | .au | AU | AUS |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2570 | United States | 1990 | 5.122496e+06 | 2020 | Kilowatt-hours, million | 119328.543 | NaN | POHNPEI, PI | 1961-1990 | Mean Monthly Value | ... | 386.6 | NaN | 4769.1 | NaN | 4769.1 | NaN | Baker Island, Howland Island, Jarvis Island... | NaN | UM | UMI |
| 2571 | United States | 1990 | 5.122496e+06 | 2020 | Kilowatt-hours, million | 119328.543 | NaN | KWAJALEIN ISLAND, PI | 1961-1990 | Mean Monthly Value | ... | 205.7 | NaN | 2593.0 | NaN | 2593.0 | NaN | Baker Island, Howland Island, Jarvis Island... | NaN | UM | UMI |
| 2572 | United States | 1990 | 5.122496e+06 | 2020 | Kilowatt-hours, million | 119328.543 | NaN | MAJURO, PI | 1961-1990 | Mean Monthly Value | ... | 301.0 | NaN | 3336.0 | NaN | 3336.0 | NaN | Baker Island, Howland Island, Jarvis Island... | NaN | UM | UMI |
| 2573 | United States | 1990 | 5.122496e+06 | 2020 | Kilowatt-hours, million | 119328.543 | NaN | KOROR/W CAROLINE ISLANDS PI | 1961-1990 | Mean Monthly Value | ... | 304.3 | NaN | 3758.3 | NaN | 3758.3 | NaN | Baker Island, Howland Island, Jarvis Island... | NaN | UM | UMI |
| 2574 | United States | 1990 | 5.122496e+06 | 2020 | Kilowatt-hours, million | 119328.543 | NaN | YAP ISLAND/CAROLINE ISLANDS PI | 1961-1990 | Mean Monthly Value | ... | 228.3 | NaN | 3049.5 | NaN | 3049.5 | NaN | Baker Island, Howland Island, Jarvis Island... | NaN | UM | UMI |
2575 rows × 43 columns
co2_solar_prep.groupby('Country')[['Annual NCDC Computed Value']].agg('mean').shape
(39, 1)
Vemos que se ha mantenido el número de datos
DF.groupby('Country')[['Annual NCDC Computed Value']].agg('mean').shape
(39, 1)
Mantengo solo las columnas de mi interés
DF = DF[['Country','Value','Quantity','Station Name','Annual NCDC Computed Value','Officialstatename','InternetccTLD','iso2','iso3']]
DF
| Country | Value | Quantity | Station Name | Annual NCDC Computed Value | Officialstatename | InternetccTLD | iso2 | iso3 | |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Australia | 2.781542e+05 | 21033.194 | ALBANY (ALBANY TOWN) | 1012.3 | The Commonwealth of Australia | .au | AU | AUS |
| 1 | Australia | 2.781542e+05 | 21033.194 | ESPERANCE (ESPERANCE POST OFFICE) | 698.8 | The Commonwealth of Australia | .au | AU | AUS |
| 2 | Australia | 2.781542e+05 | 21033.194 | TENNANT CRK POST OFFICE WAS 014050 | 321.3 | The Commonwealth of Australia | .au | AU | AUS |
| 3 | Australia | 2.781542e+05 | 21033.194 | LONGREACH POST OFFICE | 433.7 | The Commonwealth of Australia | .au | AU | AUS |
| 4 | Australia | 2.781542e+05 | 21033.194 | ESSENDON AIRPORT AMO | 594.9 | The Commonwealth of Australia | .au | AU | AUS |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2570 | United States | 5.122496e+06 | 119328.543 | POHNPEI, PI | 4769.1 | Baker Island, Howland Island, Jarvis Island... | NaN | UM | UMI |
| 2571 | United States | 5.122496e+06 | 119328.543 | KWAJALEIN ISLAND, PI | 2593.0 | Baker Island, Howland Island, Jarvis Island... | NaN | UM | UMI |
| 2572 | United States | 5.122496e+06 | 119328.543 | MAJURO, PI | 3336.0 | Baker Island, Howland Island, Jarvis Island... | NaN | UM | UMI |
| 2573 | United States | 5.122496e+06 | 119328.543 | KOROR/W CAROLINE ISLANDS PI | 3758.3 | Baker Island, Howland Island, Jarvis Island... | NaN | UM | UMI |
| 2574 | United States | 5.122496e+06 | 119328.543 | YAP ISLAND/CAROLINE ISLANDS PI | 3049.5 | Baker Island, Howland Island, Jarvis Island... | NaN | UM | UMI |
2575 rows × 9 columns
pd.options.mode.chained_assignment = None #desactivar mensaje de emergencia
DF.rename(columns={'Value':'CO2_anual','Quantity':'solar_energy_anual','Annual NCDC Computed Value':'precipitation_anual'}, inplace=True)
DF
| Country | CO2_anual | solar_energy_anual | Station Name | precipitation_anual | Officialstatename | InternetccTLD | iso2 | iso3 | |
|---|---|---|---|---|---|---|---|---|---|
| 0 | Australia | 2.781542e+05 | 21033.194 | ALBANY (ALBANY TOWN) | 1012.3 | The Commonwealth of Australia | .au | AU | AUS |
| 1 | Australia | 2.781542e+05 | 21033.194 | ESPERANCE (ESPERANCE POST OFFICE) | 698.8 | The Commonwealth of Australia | .au | AU | AUS |
| 2 | Australia | 2.781542e+05 | 21033.194 | TENNANT CRK POST OFFICE WAS 014050 | 321.3 | The Commonwealth of Australia | .au | AU | AUS |
| 3 | Australia | 2.781542e+05 | 21033.194 | LONGREACH POST OFFICE | 433.7 | The Commonwealth of Australia | .au | AU | AUS |
| 4 | Australia | 2.781542e+05 | 21033.194 | ESSENDON AIRPORT AMO | 594.9 | The Commonwealth of Australia | .au | AU | AUS |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2570 | United States | 5.122496e+06 | 119328.543 | POHNPEI, PI | 4769.1 | Baker Island, Howland Island, Jarvis Island... | NaN | UM | UMI |
| 2571 | United States | 5.122496e+06 | 119328.543 | KWAJALEIN ISLAND, PI | 2593.0 | Baker Island, Howland Island, Jarvis Island... | NaN | UM | UMI |
| 2572 | United States | 5.122496e+06 | 119328.543 | MAJURO, PI | 3336.0 | Baker Island, Howland Island, Jarvis Island... | NaN | UM | UMI |
| 2573 | United States | 5.122496e+06 | 119328.543 | KOROR/W CAROLINE ISLANDS PI | 3758.3 | Baker Island, Howland Island, Jarvis Island... | NaN | UM | UMI |
| 2574 | United States | 5.122496e+06 | 119328.543 | YAP ISLAND/CAROLINE ISLANDS PI | 3049.5 | Baker Island, Howland Island, Jarvis Island... | NaN | UM | UMI |
2575 rows × 9 columns
Creo un nuevo DF con el promedio de precipitacion anual en todo el país
DF.info()
<class 'pandas.core.frame.DataFrame'> RangeIndex: 2575 entries, 0 to 2574 Data columns (total 9 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 Country 2575 non-null object 1 CO2_anual 2575 non-null float64 2 solar_energy_anual 2575 non-null float64 3 Station Name 2575 non-null object 4 precipitation_anual 2575 non-null float64 5 Officialstatename 2575 non-null object 6 InternetccTLD 1796 non-null object 7 iso2 2575 non-null object 8 iso3 2575 non-null object dtypes: float64(3), object(6) memory usage: 181.2+ KB
DF.shape
(2575, 9)
DF = DF.loc[DF.precipitation_anual > 0]
len(DF)
2555
max_prep = DF.groupby('Station Name')[['precipitation_anual']].agg(['max'])
max_prep.columns
MultiIndex([('precipitation_anual', 'max')],
)
max_prep[('precipitation_anual', 'max')].max()
8888888.0
DF = DF.loc[DF.precipitation_anual < 10000]
len(DF)
2554
prep_mean = DF.groupby('Country')[['precipitation_anual']].agg('mean')
prep_mean
| precipitation_anual | |
|---|---|
| Country | |
| Australia | 764.802985 |
| Austria | 1105.285714 |
| Belarus | 622.833333 |
| Belgium | 821.200000 |
| Bulgaria | 528.714286 |
| Canada | 781.171179 |
| Croatia | 853.550000 |
| Cyprus | 320.000000 |
| Czechia | 569.450000 |
| Denmark | 727.277778 |
| Estonia | 612.333333 |
| Finland | 575.916667 |
| France | 736.062500 |
| Germany | 826.368421 |
| Greece | 624.800000 |
| Hungary | 560.571429 |
| Ireland | 1006.818182 |
| Italy | 778.709091 |
| Japan | 1731.690323 |
| Latvia | 666.000000 |
| Lithuania | 682.214286 |
| Luxembourg | 875.600000 |
| Malta | 553.300000 |
| Netherlands | 772.160000 |
| New Zealand | 1291.799094 |
| Norway | 860.461538 |
| Poland | 693.500000 |
| Portugal | 907.333333 |
| Romania | 583.454545 |
| Russian Federation | 497.791379 |
| Slovakia | 595.733333 |
| Slovenia | 1393.100000 |
| Spain | 626.490000 |
| Sweden | 589.812500 |
| Switzerland | 1628.000000 |
| Türkiye | 651.342857 |
| Ukraine | 568.000000 |
| United Kingdom | 857.700000 |
| United States | 940.355913 |
len(prep_mean)
39
Elimino las siguientes columnas para luego realizar un merge con prep_mean
DF.drop(columns=['Station Name','precipitation_anual'],inplace=True)
Realizo un merge entre mi DF y prep_mean. Mantengo los elemento en común de la columna Country
DF = prep_mean.merge(DF,how='inner',left_on='Country',right_on='Country')
DF
| Country | precipitation_anual | CO2_anual | solar_energy_anual | Officialstatename | InternetccTLD | iso2 | iso3 | |
|---|---|---|---|---|---|---|---|---|
| 0 | Australia | 764.802985 | 2.781542e+05 | 21033.194 | The Commonwealth of Australia | .au | AU | AUS |
| 1 | Australia | 764.802985 | 2.781542e+05 | 21033.194 | The Commonwealth of Australia | .au | AU | AUS |
| 2 | Australia | 764.802985 | 2.781542e+05 | 21033.194 | The Commonwealth of Australia | .au | AU | AUS |
| 3 | Australia | 764.802985 | 2.781542e+05 | 21033.194 | The Commonwealth of Australia | .au | AU | AUS |
| 4 | Australia | 764.802985 | 2.781542e+05 | 21033.194 | The Commonwealth of Australia | .au | AU | AUS |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2549 | United States | 940.355913 | 5.122496e+06 | 119328.543 | Baker Island, Howland Island, Jarvis Island... | NaN | UM | UMI |
| 2550 | United States | 940.355913 | 5.122496e+06 | 119328.543 | Baker Island, Howland Island, Jarvis Island... | NaN | UM | UMI |
| 2551 | United States | 940.355913 | 5.122496e+06 | 119328.543 | Baker Island, Howland Island, Jarvis Island... | NaN | UM | UMI |
| 2552 | United States | 940.355913 | 5.122496e+06 | 119328.543 | Baker Island, Howland Island, Jarvis Island... | NaN | UM | UMI |
| 2553 | United States | 940.355913 | 5.122496e+06 | 119328.543 | Baker Island, Howland Island, Jarvis Island... | NaN | UM | UMI |
2554 rows × 8 columns
DF = DF.drop_duplicates()
DF
| Country | precipitation_anual | CO2_anual | solar_energy_anual | Officialstatename | InternetccTLD | iso2 | iso3 | |
|---|---|---|---|---|---|---|---|---|
| 0 | Australia | 764.802985 | 2.781542e+05 | 21033.194 | The Commonwealth of Australia | .au | AU | AUS |
| 67 | Austria | 1105.285714 | 6.214525e+04 | 2042.934 | The Republic of Austria | .at | AT | AUT |
| 74 | Belarus | 622.833333 | 1.083451e+05 | 176.000 | The Republic of Belarus | .by | BY | BLR |
| 80 | Belgium | 821.200000 | 1.202926e+05 | 5105.400 | The Kingdom of Belgium | .be | BE | BEL |
| 81 | Bulgaria | 528.714286 | 7.669920e+04 | 1480.856 | The Republic of Bulgaria | .bg | BG | BGR |
| 88 | Canada | 781.171179 | 4.582183e+05 | 4846.000 | Canada | .ca | CA | CAN |
| 317 | Croatia | 853.550000 | 2.297979e+04 | 95.500 | The Republic of Croatia | .hr | HR | HRV |
| 319 | Cyprus | 320.000000 | 4.653217e+03 | 295.608 | The Republic of Cyprus | .cy | CY | CYP |
| 320 | Czechia | 569.450000 | 1.642107e+05 | 2287.014 | The Czech Republic | .cz | CZ | CZE |
| 324 | Denmark | 727.277778 | 5.487931e+04 | 1180.532 | The Kingdom of Denmark | .dk | DK | DNK |
| 360 | Estonia | 612.333333 | 3.692221e+04 | 122.540 | The Republic of Estonia | .ee | EE | EST |
| 363 | Finland | 575.916667 | 5.691434e+04 | 218.471 | The Republic of Finland | .fi | FI | FIN |
| 375 | France | 736.062500 | 4.009646e+05 | 13398.392 | The French Republic | .fr | FR | FRA |
| 391 | Germany | 826.368421 | 1.051979e+06 | 48641.000 | The Federal Republic of Germany | .de | DE | DEU |
| 410 | Greece | 624.800000 | 8.343804e+04 | 4446.854 | The Hellenic Republic | .gr | GR | GRC |
| 432 | Hungary | 560.571429 | 7.322554e+04 | 2459.000 | Hungary | .hu | HU | HUN |
| 439 | Ireland | 1006.818182 | 3.294442e+04 | 63.780 | Ireland | .ie | IE | IRL |
| 450 | Italy | 778.709091 | 4.395498e+05 | 24941.504 | The Italian Republic | .it | IT | ITA |
| 549 | Japan | 1731.690323 | 1.158129e+06 | 79086.858 | Japan | .jp | JP | JPN |
| 704 | Latvia | 666.000000 | 1.966140e+04 | 4.847 | The Republic of Latvia | .lv | LV | LVA |
| 709 | Lithuania | 682.214286 | 3.576773e+04 | 128.800 | The Republic of Lithuania | .lt | LT | LTU |
| 723 | Luxembourg | 875.600000 | 1.182335e+04 | 161.261 | The Grand Duchy of Luxembourg | .lu | LU | LUX |
| 724 | Malta | 553.300000 | 2.394194e+03 | 236.837 | The Republic of Malta | .mt | MT | MLT |
| 725 | Netherlands | 772.160000 | 1.618069e+05 | 8765.172 | The Kingdom of the Netherlands | .nl | NL | NLD |
| 730 | New Zealand | 1291.799094 | 2.550251e+04 | 160.216 | New Zealand | .nz | NZ | NZL |
| 1392 | Norway | 860.461538 | 3.509663e+04 | 26.521 | The Kingdom of Norway | .no | NO | NOR |
| 1405 | Poland | 693.500000 | 3.768136e+05 | 1957.916 | The Republic of Poland | .pl | PL | POL |
| 1423 | Portugal | 907.333333 | 4.532509e+04 | 1715.590 | The Portuguese Republic | .pt | PT | PRT |
| 1435 | Romania | 583.454545 | 1.734639e+05 | 1733.375 | Romania | .ro | RO | ROU |
| 1446 | Russian Federation | 497.791379 | 2.534865e+06 | 2022.400 | The Russian Federation | .ru | RU | RUS |
| 1620 | Slovakia | 595.733333 | 6.147019e+04 | 663.000 | The Slovak Republic | .sk | SK | SVK |
| 1623 | Slovenia | 1393.100000 | 1.509485e+04 | 368.196 | The Republic of Slovenia | .si | SI | SVN |
| 1624 | Spain | 626.490000 | 2.313284e+05 | 20667.000 | The Kingdom of Spain | .es | ES | ESP |
| 1654 | Sweden | 589.812500 | 5.758009e+04 | 1051.000 | The Kingdom of Sweden | .se | SE | SWE |
| 1670 | Switzerland | 1628.000000 | 4.415988e+04 | 2598.659 | The Swiss Confederation | .ch | CH | CHE |
| 1674 | Türkiye | 651.342857 | 1.516645e+05 | 10950.180 | The Republic of Turkey | .tr | TR | TUR |
| 1709 | Ukraine | 568.000000 | 7.058300e+05 | 5969.400 | Ukraine | .ua | UA | UKR |
| 1756 | United Kingdom | 857.700000 | 6.026521e+05 | 13157.994 | The United Kingdom of Great Britain and Northe... | .uk | GB | GBR |
| 1776 | United States | 940.355913 | 5.122496e+06 | 119328.543 | Baker Island, Howland Island, Jarvis Island... | NaN | UM | UMI |
DF.shape
(39, 8)
DF.head()
| Country | precipitation_anual | CO2_anual | solar_energy_anual | Officialstatename | InternetccTLD | iso2 | iso3 | |
|---|---|---|---|---|---|---|---|---|
| 0 | Australia | 764.802985 | 278154.156296 | 21033.194 | The Commonwealth of Australia | .au | AU | AUS |
| 67 | Austria | 1105.285714 | 62145.248288 | 2042.934 | The Republic of Austria | .at | AT | AUT |
| 74 | Belarus | 622.833333 | 108345.055021 | 176.000 | The Republic of Belarus | .by | BY | BLR |
| 80 | Belgium | 821.200000 | 120292.644131 | 5105.400 | The Kingdom of Belgium | .be | BE | BEL |
| 81 | Bulgaria | 528.714286 | 76699.197427 | 1480.856 | The Republic of Bulgaria | .bg | BG | BGR |
DF.info()
<class 'pandas.core.frame.DataFrame'> Index: 39 entries, 0 to 1776 Data columns (total 8 columns): # Column Non-Null Count Dtype --- ------ -------------- ----- 0 Country 39 non-null object 1 precipitation_anual 39 non-null float64 2 CO2_anual 39 non-null float64 3 solar_energy_anual 39 non-null float64 4 Officialstatename 39 non-null object 5 InternetccTLD 38 non-null object 6 iso2 39 non-null object 7 iso3 39 non-null object dtypes: float64(3), object(5) memory usage: 2.7+ KB
import matplotlib.pyplot as plt
DF['precipitation_anual'].plot(kind='box')
plt.title('Variación de la precipitación media anual en diferentes países')
plt.show()
DF[['CO2_anual']].plot(kind='box')
plt.title('Variación de la emisión de CO2 en 1990 por diferentes países')
plt.show()
plt.bar(DF['Country'], DF['solar_energy_anual'])
plt.xticks(rotation=90)
plt.title('Contribución en la producción de energía solar en el 2020 por país')
plt.show()
plt.bar(DF['Country'], DF['CO2_anual'])
plt.xticks(rotation=90)
plt.title('Contribución en la emisión de CO2 en el año 1990 por país')
plt.show()