I. Creating DataFrames
1. Create a tabular DataFrame
import pandas as pd
fruits = pd.DataFrame([[30, 21]], columns=['Apples', 'Bananas'])
Result: a single row (default index 0) with Apples = 30 and Bananas = 21.
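To inspect the result without printing the whole table, a minimal sketch is to check the shape and the automatically generated index. It reuses the same values as above; the helper names n_rows and n_cols are only illustrative.

```python
import pandas as pd

fruits = pd.DataFrame([[30, 21]], columns=['Apples', 'Bananas'])

# shape is (number of rows, number of columns)
n_rows, n_cols = fruits.shape

# With no index argument, pandas numbers the rows 0, 1, 2, ...
default_index = list(fruits.index)

# A column is selected by name and comes back as a Series.
apples = fruits['Apples']
```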
2. Create a second DataFrame, this time with two rows and custom index labels
fruit_sales = pd.DataFrame([[35, 21], [41, 34]], columns=['Apples', 'Bananas'],
index=['2017 Sales', '2018 Sales'])
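As a small sketch of what the custom index is useful for, rows can then be looked up by label with .loc. The data is the same as above; the variable names row_2018 and apples_2017 are only illustrative.

```python
import pandas as pd

fruit_sales = pd.DataFrame(
    [[35, 21], [41, 34]],
    columns=['Apples', 'Bananas'],
    index=['2017 Sales', '2018 Sales'],
)

# Select a whole row by its index label; the result is a Series keyed by column name.
row_2018 = fruit_sales.loc['2018 Sales']

# Select a single cell by row label and column label.
apples_2017 = fruit_sales.loc['2017 Sales', 'Apples']
```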
3. Series: build the values and the index labels as two separate lists
quantities = ['4 cups', '1 cup', '2 large', '1 can']
items = ['Flour', 'Milk', 'Eggs', 'Spam']
recipe = pd.Series(quantities, index=items, name='Dinner')
Flour     4 cups
Milk       1 cup
Eggs     2 large
Spam       1 can
Name: Dinner, dtype: object
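A Series behaves like a single labeled column. The sketch below, using the same recipe data, shows label-based and position-based lookup and how the Series name becomes the column name when it is converted to a DataFrame.

```python
import pandas as pd

quantities = ['4 cups', '1 cup', '2 large', '1 can']
items = ['Flour', 'Milk', 'Eggs', 'Spam']
recipe = pd.Series(quantities, index=items, name='Dinner')

milk = recipe['Milk']         # lookup by index label
first = recipe.iloc[0]        # lookup by position
as_frame = recipe.to_frame()  # one-column DataFrame; the column is named 'Dinner'
```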
II. Reading data
1. Read a CSV file
reviews = pd.read_csv('../input/wine-reviews/winemag-data_first150k.csv', index_col=0)
reviews
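index_col=0 tells pandas to use the first column of the file as the index. With index_col=1, the second column (country) would become the index instead.
Below is the result without index_col. The first column of the file shows up as an ordinary column named "Unnamed: 0", next to the automatically generated index: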
|  | Unnamed: 0 | country | description | designation | points | price | province | region_1 | region_2 | variety | winery |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | US | This tremendous 100% varietal wine hails from ... | Martha's Vineyard | 96 | 235.0 | California | Napa Valley | Napa | Cabernet Sauvignon | Heitz |
| 1 | 1 | Spain | Ripe aromas of fig, blackberry and cassis are ... | Carodorum Selección Especial Reserva | 96 | 110.0 | Northern Spain | Toro | NaN | Tinta de Toro | Bodega Carmen Rodríguez |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 150928 | 150928 | France | A perfect salmon shade, with scents of peaches... | Grand Brut Rosé | 90 | 52.0 | Champagne | Champagne | NaN | Champagne Blend | Gosset |
| 150929 | 150929 | Italy | More Pinot Grigios should taste like this. A r... | NaN | 90 | 15.0 | Northeastern Italy | Alto Adige | NaN | Pinot Grigio | Alois Lageder |
150930 rows × 11 columns
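The effect of index_col can be reproduced without the Kaggle file. The sketch below uses a tiny in-memory CSV (a made-up stand-in, not the real data) whose first header cell is empty, as in the wine file, and reads it three ways.

```python
import io
import pandas as pd

# Hypothetical two-row CSV standing in for the wine file.
# The first header cell is empty, so pandas names that column "Unnamed: 0".
csv_text = ",country,points\n0,US,96\n1,Spain,96\n"

# No index_col: the first column stays a regular column.
no_index = pd.read_csv(io.StringIO(csv_text))

# index_col=0: the first column becomes the index.
by_position = pd.read_csv(io.StringIO(csv_text), index_col=0)

# index_col=1: the second column (country) becomes the index.
by_country = pd.read_csv(io.StringIO(csv_text), index_col=1)
```

With index_col=0 the redundant "Unnamed: 0" column disappears from the columns, which is usually what you want when the file was written with its index included.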
2. Create a DataFrame from a dictionary, column by column (each key is a column name, each value is that column's data)
animals = pd.DataFrame({'Cows': [12, 20], 'Goats': [22, 19]}, index=['Year 1', 'Year 2'])
animals
|  | Cows | Goats |
|---|---|---|
| Year 1 | 12 | 22 |
| Year 2 | 20 | 19 |
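The dictionary form builds the table column by column, while the list-of-lists form used in section I builds it row by row. As a quick sketch, the two constructions below produce the same table; the names by_column and by_row are only illustrative.

```python
import pandas as pd

labels = ['Year 1', 'Year 2']

# Column-wise: each dictionary key is a column name.
by_column = pd.DataFrame({'Cows': [12, 20], 'Goats': [22, 19]}, index=labels)

# Row-wise: each inner list is one row, and column names are passed separately.
by_row = pd.DataFrame([[12, 22], [20, 19]], columns=['Cows', 'Goats'], index=labels)

# equals() compares values, index, columns and dtypes.
same = by_column.equals(by_row)
```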
3. Read from a SQL database
import sqlite3
conn = sqlite3.connect("../input/pitchfork-data/database.sqlite")
music_reviews = pd.read_sql_query("SELECT * FROM artists", conn)
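To try read_sql_query without the Pitchfork database, here is a minimal sketch against an in-memory SQLite database. The table layout (reviewid, artist) and the rows are made up for illustration and are not taken from the real dataset.

```python
import sqlite3
import pandas as pd

# In-memory database with a hypothetical "artists" table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE artists (reviewid INTEGER, artist TEXT)")
conn.executemany(
    "INSERT INTO artists VALUES (?, ?)",
    [(1, 'artist a'), (2, 'artist b')],
)
conn.commit()

# Whole table as a DataFrame.
music_reviews = pd.read_sql_query("SELECT * FROM artists", conn)

# Parameterized query: values are passed separately instead of being pasted into the SQL string.
filtered = pd.read_sql_query(
    "SELECT * FROM artists WHERE reviewid = ?", conn, params=(2,)
)

conn.close()
```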
III. Writing data
1. Save the DataFrame created earlier to disk
animals.to_csv("cows_and_goats.csv")
Other formats are written the same way, for example:
to_excel
to_pickle
...
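A quick way to check that a file was written correctly is to read it back. The sketch below round-trips the animals table through CSV and pickle in a temporary directory; the temporary paths are an assumption made so the example does not leave files behind.

```python
import os
import tempfile
import pandas as pd

animals = pd.DataFrame({'Cows': [12, 20], 'Goats': [22, 19]}, index=['Year 1', 'Year 2'])

with tempfile.TemporaryDirectory() as tmp:
    csv_path = os.path.join(tmp, "cows_and_goats.csv")
    pkl_path = os.path.join(tmp, "cows_and_goats.pkl")

    # to_csv writes the index as the first column by default,
    # so index_col=0 is needed to restore it when reading back.
    animals.to_csv(csv_path)
    from_csv = pd.read_csv(csv_path, index_col=0)

    # Pickle stores the DataFrame object itself, including index and dtypes.
    animals.to_pickle(pkl_path)
    from_pickle = pd.read_pickle(pkl_path)
```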
Note:
The content above comes from Kaggle.