Pandas Operations - Unique, Nunique, Value Counts, Apply, Drop, Columns, Index, Sort Values, Is Null, Pivot Tables - Python
Run the Python code here: https://repl.it/@VinitKhandelwal/pandas-operations
import numpy as np
import pandas as pd
df = pd.DataFrame({'col1':[1,2,3,4],'col2':[1111,2222,3333,2222],'col3':['aaa','bbb','ccc','ddd']})
print(df)
print("LIST OF UNIQUE VALUES IN A COLUMN")
print(df['col2'].unique())
print("COUNT OF UNIQUE VALUES IN A COLUMN")
print(df['col2'].nunique())
print("COUNT OF VALUES IN A COLUMN")
print(df['col2'].value_counts())
print("CONDITIONAL SELECTION")
print(df[(df['col1']>2) & (df['col2']<3333)])
print(df['col1']>2)
print("APPLY")
def times2(x):
    return x*2
print(df.apply(times2))
print(df['col3'].apply(len))
print("APPLY LAMBDA FUNCTION")
print(df.apply(lambda x: x*3))
print("DROP")
print(df.drop('col1', axis=1))
df.drop('col1', axis=1, inplace=True)
print(df)
df['col1']=[1,2,3,4]
print(df)
print("DETAILS OF COLUMNS AND INDICES")
print(df.columns)
print(df.index)
print("SORT COLUMN")
print(df.sort_values('col2'))
print("IS NULL")
print(df.isnull())
print("PIVOT TABLE")
df = pd.DataFrame({'A':['foo', 'foo', 'foo', 'bar', 'bar', 'bar'], 'B':['one', 'one', 'two', 'two', 'one', 'one'], 'C':['x', 'y', 'x', 'y', 'x', 'y'], 'D':[1,3,2,5,4,1]})
print(df)
df2 = df.pivot_table(values='D', index=['A','B'], columns=['C'])
print(df2)
print("DETAILS OF COLUMNS AND INDICES")
print(df2.columns)
print(df2.index)
OUTPUT
col1 col2 col3
0 1 1111 aaa
1 2 2222 bbb
2 3 3333 ccc
3 4 2222 ddd
LIST OF UNIQUE VALUES IN A COLUMN
[1111 2222 3333]
COUNT OF UNIQUE VALUES IN A COLUMN
3
COUNT OF VALUES IN A COLUMN
2222 2
1111 1
3333 1
Name: col2, dtype: int64
CONDITIONAL SELECTION
col1 col2 col3
3 4 2222 ddd
0 False
1 False
2 True
3 True
Name: col1, dtype: bool
APPLY
col1 col2 col3
0 2 2222 aaaaaa
1 4 4444 bbbbbb
2 6 6666 cccccc
3 8 4444 dddddd
0 3
1 3
2 3
3 3
Name: col3, dtype: int64
APPLY LAMBDA FUNCTION
col1 col2 col3
0 3 3333 aaaaaaaaa
1 6 6666 bbbbbbbbb
2 9 9999 ccccccccc
3 12 6666 ddddddddd
DROP
col2 col3
0 1111 aaa
1 2222 bbb
2 3333 ccc
3 2222 ddd
col2 col3
0 1111 aaa
1 2222 bbb
2 3333 ccc
3 2222 ddd
col2 col3 col1
0 1111 aaa 1
1 2222 bbb 2
2 3333 ccc 3
3 2222 ddd 4
DETAILS OF COLUMNS AND INDICES
Index(['col2', 'col3', 'col1'], dtype='object')
RangeIndex(start=0, stop=4, step=1)
SORT COLUMN
col2 col3 col1
0 1111 aaa 1
1 2222 bbb 2
3 2222 ddd 4
2 3333 ccc 3
IS NULL
col2 col3 col1
0 False False False
1 False False False
2 False False False
3 False False False
PIVOT TABLE
A B C D
0 foo one x 1
1 foo one y 3
2 foo two x 2
3 bar two y 5
4 bar one x 4
5 bar one y 1
C          x    y
A   B
bar one  4.0  1.0
    two  NaN  5.0
foo one  1.0  3.0
    two  2.0  NaN
DETAILS OF COLUMNS AND INDICES
Index(['x', 'y'], dtype='object', name='C')
MultiIndex(levels=[['bar', 'foo'], ['one', 'two']],
           labels=[[0, 0, 1, 1], [0, 1, 0, 1]],
           names=['A', 'B'])
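
A note on the pivot table output: the NaN cells appear because the data has no row for those (A, B, C) combinations, and the values print as floats because pivot_table aggregates with the mean by default. The sketch below rebuilds the same DataFrame and shows one possible variation that sums instead and fills the empty cells with 0. The names default_pivot and filled_pivot, and the choice of aggfunc='sum' with fill_value=0, are assumptions made for illustration and are not part of the original script.

```python
import pandas as pd

# Same data as the pivot table example in the listing.
df = pd.DataFrame({
    'A': ['foo', 'foo', 'foo', 'bar', 'bar', 'bar'],
    'B': ['one', 'one', 'two', 'two', 'one', 'one'],
    'C': ['x', 'y', 'x', 'y', 'x', 'y'],
    'D': [1, 3, 2, 5, 4, 1],
})

# Default behaviour: aggregate D with the mean; combinations of
# (A, B, C) that have no rows show up as NaN.
default_pivot = df.pivot_table(values='D', index=['A', 'B'], columns=['C'])

# Assumed variation (not in the original script): sum the values
# and replace the missing combinations with 0.
filled_pivot = df.pivot_table(values='D', index=['A', 'B'], columns=['C'],
                              aggfunc='sum', fill_value=0)

# Count the empty cells in the default pivot table.
print(default_pivot.isnull().sum().sum())

print(filled_pivot)

# Read a single cell: the row label is an (A, B) tuple, the column label is a C value.
print(filled_pivot.loc[('bar', 'one'), 'x'])
```

Whether filling with 0 is appropriate depends on the data: it hides the difference between "no rows" and "rows that sum to zero". Newer pandas releases may also print the MultiIndex differently from the levels/labels form shown in the output above.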