Pandas is a Python library used for data manipulation and data analysis.
You can install pandas from a terminal (for example, one opened in Jupyter) by running:
- conda install pandas
Once the library is installed, use the import statement to access the functions and classes that pandas provides.
DataFrame: It is like an Excel spreadsheet, made up of rows and columns. The DataFrame (DF) is the most commonly used pandas object.
- Creating a DataFrame
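import pandas as pd

# Each key becomes a column name; each list holds that column's values.
student_details = {
    'DOB': ['1/1/2017', '1/2/2017', '1/3/2017', '1/4/2017', '1/5/2017', '1/6/2017'],
    'MARKS': [32, 35, 28, 24, 32, 31],
    'SPORTS': ['Chess', 'Hockey', 'Football', 'Basketball', 'Ludo', 'Pool'],
    'NAME': ['Tanmay', 'Sarthak', 'Sanyam', 'Sanskar', 'Deepak', 'Harsh']
}
print(student_details)  # prints the plain Python dictionary

# Build the DataFrame from the dictionary.
df = pd.DataFrame(student_details)
print(df)
Output of print(df):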
        DOB  MARKS      SPORTS     NAME
0  1/1/2017     32       Chess   Tanmay
1  1/2/2017     35      Hockey  Sarthak
2  1/3/2017     28    Football   Sanyam
3  1/4/2017     24  Basketball  Sanskar
4  1/5/2017     32        Ludo   Deepak
5  1/6/2017     31        Pool    Harsh
1. df.shape # This attribute returns the dimensions as (rows, columns).
Output:
(6, 4)
2. df.head(3) # This function returns the first n rows (5 by default).
Output:
        DOB  MARKS    SPORTS     NAME
0  1/1/2017     32     Chess   Tanmay
1  1/2/2017     35    Hockey  Sarthak
2  1/3/2017     28  Football   Sanyam
3. df.tail(2) # This function returns the last n rows (5 by default).
Output:
        DOB  MARKS SPORTS    NAME
4  1/5/2017     32   Ludo  Deepak
5  1/6/2017     31   Pool   Harsh
4. df.columns # This attribute returns the column labels.
Output:
Index(['DOB', 'MARKS', 'SPORTS', 'NAME'], dtype='object')
5. df[1:4] # This slice returns the rows at positions 1 to 3 (the end position, 4, is excluded).
Output:
        DOB  MARKS      SPORTS     NAME
1  1/2/2017     35      Hockey  Sarthak
2  1/3/2017     28    Football   Sanyam
3  1/4/2017     24  Basketball  Sanskar
6. df['DOB'] # This selects a single column, returned as a Series.
Output:
0    1/1/2017
1    1/2/2017
2    1/3/2017
3    1/4/2017
4    1/5/2017
5    1/6/2017
Name: DOB, dtype: object
7. df.dtypes # This attribute returns the data type of each column.
Output:
DOB       object
MARKS      int64
SPORTS    object
NAME      object
dtype: object
8. df['MARKS'].max()
Output:
35
9. df['MARKS'].min()
Output:
24
10. df['MARKS'].mean()
Output:
30.333333333333332
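The aggregations in items 8 to 10 return only the value, not the row it came from. As a minimal sketch, assuming you also want to know which student a value belongs to, the Series methods idxmax() and idxmin() can be combined with loc. The variable names (top_label, top_student and so on) are chosen here for illustration, and only two of the columns are rebuilt to keep the example short.

```python
import pandas as pd

# Two columns from the student data above, rebuilt so the sketch runs on its own.
df = pd.DataFrame({
    'MARKS': [32, 35, 28, 24, 32, 31],
    'NAME': ['Tanmay', 'Sarthak', 'Sanyam', 'Sanskar', 'Deepak', 'Harsh'],
})

# idxmax()/idxmin() return the row label of the highest/lowest value.
top_label = df['MARKS'].idxmax()
low_label = df['MARKS'].idxmin()

# Look up the matching names by row label and column name.
top_student = df.loc[top_label, 'NAME']
low_student = df.loc[low_label, 'NAME']

print(top_student, df['MARKS'].max())
print(low_student, df['MARKS'].min())
```

If several rows share the highest value, idxmax() returns the label of the first one.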
Similarly, pandas offers many more tools, such as describe, loc, and index, that you can use to work with different data.
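The three names mentioned above can be tried on the same student data. This is a rough sketch, not a full tour: the variable names (summary, second_name, high_scorers) and the cut-off of 30 marks are assumptions made for illustration.

```python
import pandas as pd

# Two columns from the student data above, rebuilt so the sketch runs on its own.
df = pd.DataFrame({
    'MARKS': [32, 35, 28, 24, 32, 31],
    'NAME': ['Tanmay', 'Sarthak', 'Sanyam', 'Sanskar', 'Deepak', 'Harsh'],
})

# describe() summarises the numeric columns (count, mean, std, min, quartiles, max).
summary = df.describe()
print(summary)

# index holds the row labels; here it is the default range 0 to 5.
print(df.index)

# loc selects by label: the row labelled 1, column 'NAME'.
second_name = df.loc[1, 'NAME']
print(second_name)

# loc also accepts a boolean condition to filter rows.
high_scorers = df.loc[df['MARKS'] > 30, ['NAME', 'MARKS']]
print(high_scorers)
```

Note that describe and loc are used differently: describe() is called like a function, while loc is indexed with square brackets and index is a plain attribute.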