In this blog post, we will look at some basic commands and operations of Python's pandas library. To run the commands below, I used an Azure Databricks notebook with Python as the language. With the %python magic command, the same commands can also be run in notebooks whose default language is not Python.
Basic pandas commands
1. Importing pandas library
We can also import numpy alongside pandas; some of the commands below make use of it.
2. To read a file and create a DataFrame
We will use the sample file data above for the examples that follow.
3. To display the first 1000 rows
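The sample file is not reproduced here, so the following is only a minimal sketch of steps 1 and 2. It reads a tiny in-memory CSV instead of a real path; the column names (`id`, `city`, `area`, `price`) and values are hypothetical stand-ins, not the original data.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical stand-in for the sample file: a tiny CSV held in memory.
csv_text = """id,city,area,price
1,Pune,850,120000
2,Delhi,,95000
3,Pune,1200,
4,Mumbai,640,150000
"""

# pd.read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))
print(df)
```

With a real file, the `io.StringIO(...)` argument would simply be replaced by the file's path.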
4. To check the type of the DataFrame + type of the column in the DataFrame
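As a rough illustration, assuming a small hypothetical DataFrame named `df`, the type checks might look like this. Both the Python type of a column and its pandas data type are shown, since the heading could mean either.

```python
import pandas as pd

# Hypothetical data used only for illustration.
df = pd.DataFrame({
    "id": [1, 2, 3],
    "city": ["Pune", "Delhi", "Pune"],
    "price": [120000.0, 95000.0, 150000.0],
})

print(type(df))            # the DataFrame class
print(type(df["price"]))   # a single column is a Series
print(df.dtypes)           # data type of every column
print(df["price"].dtype)   # data type of one column
```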
5. To check the first/last n entries of the DataFrame (with and without the default) + the first/last n entries of a DataFrame column
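A minimal sketch with a hypothetical ten-row DataFrame: `head()` and `tail()` return 5 rows when no argument is given, and both also work on a single column.

```python
import pandas as pd

# Hypothetical ten-row DataFrame.
df = pd.DataFrame({"id": range(1, 11), "price": range(100, 1100, 100)})

first_default = df.head()        # first 5 rows (the default)
last_default = df.tail()         # last 5 rows (the default)
first_three = df.head(3)         # first n rows
last_two = df.tail(2)            # last n rows
col_first = df["price"].head(3)  # first n entries of one column
col_last = df["price"].tail(2)   # last n entries of one column
```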
6. To check the dimensions of our data
In the output for the sample data, 1460 is the number of rows and 16 is the number of columns.
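For illustration, here is the same check on a small hypothetical DataFrame (4 rows and 3 columns rather than the sample file's 1460 and 16). The `shape` attribute returns a `(rows, columns)` tuple.

```python
import pandas as pd

# Hypothetical data: 4 rows, 3 columns.
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "city": ["Pune", "Delhi", "Pune", "Mumbai"],
    "price": [120000, 95000, 150000, 80000],
})

n_rows, n_cols = df.shape  # shape is a (rows, columns) tuple
print(n_rows, n_cols)
```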
7. To view a summary of the data set/DataFrame
8. To view descriptive statistics about the dataset
By default, the describe function summarizes only numeric columns; passing include="all" also covers the other data types.
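A short sketch of steps 7 and 8 on hypothetical data: `info()` prints a summary of columns, non-null counts, and data types, while `describe()` returns the descriptive statistics as a new DataFrame.

```python
import pandas as pd

# Hypothetical data with one text column and two numeric columns.
df = pd.DataFrame({
    "city": ["Pune", "Delhi", "Pune", "Mumbai"],
    "area": [850.0, 700.0, 1200.0, 640.0],
    "price": [120000, 95000, 150000, 80000],
})

df.info()                                  # prints the summary
numeric_summary = df.describe()            # numeric columns only
full_summary = df.describe(include="all")  # text columns included too

print(numeric_summary)
print(full_summary)
```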
9. To return a Series containing the number of unique values
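One way to do this, sketched on hypothetical data, is `nunique()`, which returns one count per column as a Series.

```python
import pandas as pd

# Hypothetical data with repeated values.
df = pd.DataFrame({
    "city": ["Pune", "Delhi", "Pune", "Mumbai"],
    "rooms": [2, 3, 2, 2],
})

unique_counts = df.nunique()       # Series: distinct values per column
city_count = df["city"].nunique()  # the same count for a single column
print(unique_counts)
```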
10. DataFrame Index
11. To check the column names of the dataset/DataFrame + to know row/column count of the dataset/DataFrame
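Steps 10 and 11 can be sketched together, assuming a small hypothetical DataFrame: the `index` and `columns` attributes hold the row labels and column names, and `len()` on either gives the corresponding count.

```python
import pandas as pd

# Hypothetical data: 4 rows, 3 columns.
df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "city": ["Pune", "Delhi", "Pune", "Mumbai"],
    "price": [120000, 95000, 150000, 80000],
})

print(df.index)    # row labels
print(df.columns)  # column names

row_count = len(df.index)
col_count = len(df.columns)
```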
12. To rename multiple columns
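A minimal sketch using `rename` with a mapping of old names to new names; the column names here are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2],
    "city": ["Pune", "Delhi"],
    "price": [120000, 95000],
})

# Map old column names to new ones; unlisted columns keep their names.
renamed = df.rename(columns={"id": "house_id", "price": "sale_price"})
print(renamed.columns)
```

`rename` returns a new DataFrame, so the original column names in `df` stay unchanged unless the result is assigned back.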
13. To create a copy of our DataFrame
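As a small illustration on hypothetical data, `copy()` creates an independent DataFrame, so later changes to the copy do not touch the original.

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2], "price": [120000, 95000]})

df_copy = df.copy()          # independent copy of data and index
df_copy.loc[0, "price"] = 0  # modify only the copy

print(df)
print(df_copy)
```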
14. To add a column with default value
15. To add multiple columns with default values
Here we have used np.nan, which is the numpy library's representation of a missing (null) value.
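One possible way to write steps 14 and 15 is sketched below. The new column names (`country`, `rating`, `verified`) are hypothetical, and `assign` is just one option for adding several columns at once.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3], "city": ["Pune", "Delhi", "Pune"]})

# One column: the scalar is repeated for every row.
df["country"] = "India"

# Several columns at once, each with its own default value.
df = df.assign(rating=np.nan, verified=False)
print(df)
```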
16. To add n rows in DataFrame
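A sketch of adding rows with `pd.concat`, assuming the new rows are first collected in a second DataFrame; the data is hypothetical. The older `DataFrame.append` method was removed in pandas 2.0, so it is not used here.

```python
import pandas as pd

df = pd.DataFrame({"id": [1, 2], "city": ["Pune", "Delhi"]})

# The n new rows, built as their own DataFrame.
new_rows = pd.DataFrame({
    "id": [3, 4, 5],
    "city": ["Mumbai", "Chennai", "Pune"],
})

# ignore_index=True renumbers the combined rows from 0.
df = pd.concat([df, new_rows], ignore_index=True)
print(df)
```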
17. To remove the rows + columns
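A minimal sketch with `drop` on hypothetical data: rows are removed by index label and columns by name.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "city": ["Pune", "Delhi", "Pune", "Mumbai"],
    "price": [120000, 95000, 150000, 80000],
})

without_rows = df.drop(index=[0, 2])     # remove rows labelled 0 and 2
without_cols = df.drop(columns=["city"]) # remove the "city" column
print(without_rows)
print(without_cols)
```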
18. To select data: all columns + selected columns + rows matching a filter condition + selected columns with a filter
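The selections in this step can be sketched as follows, using hypothetical data and an arbitrary filter (`price` above 100000) chosen only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "city": ["Pune", "Delhi", "Pune", "Mumbai"],
    "price": [120000, 95000, 150000, 80000],
})

all_columns = df                        # every column
selected = df[["id", "price"]]          # only the listed columns
filtered = df[df["price"] > 100000]     # rows matching a condition
# Filter rows and pick columns in one step with .loc.
filtered_selected = df.loc[df["price"] > 100000, ["id", "city"]]
print(filtered_selected)
```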
19. To select the data contained in the first row and the first column + entire row + last column + multiple rows and columns combo
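A sketch of position-based selection with `iloc` on hypothetical data. The heading is read here as "the value at the first row and first column"; other readings are possible.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3, 4],
    "city": ["Pune", "Delhi", "Pune", "Mumbai"],
    "price": [120000, 95000, 150000, 80000],
})

first_cell = df.iloc[0, 0]      # first row, first column
first_row = df.iloc[0]          # entire first row as a Series
last_col = df.iloc[:, -1]       # entire last column
block = df.iloc[0:2, [0, 2]]    # rows 0-1, first and third columns
print(block)
```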
20. To Detect missing values as data + count + proportions
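As a small illustration on hypothetical data, `isnull()` gives a True/False mask, summing the mask counts the missing values per column, and taking its mean gives the proportion.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "city": ["Pune", None, "Pune", "Mumbai"],
    "area": [850.0, np.nan, 1200.0, np.nan],
})

missing_mask = df.isnull()          # True where a value is missing
missing_count = df.isnull().sum()   # missing values per column
missing_share = df.isnull().mean()  # proportion missing per column
print(missing_count)
print(missing_share)
```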
21. To Remove missing values
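A minimal sketch with `dropna` on hypothetical data: by default it drops rows that contain a missing value, and `axis=1` drops columns instead.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "id": [1, 2, 3],
    "area": [850.0, np.nan, 1200.0],
    "price": [120000, 95000, 150000],
})

rows_dropped = df.dropna()        # drop rows with any missing value
cols_dropped = df.dropna(axis=1)  # drop columns with any missing value
print(rows_dropped)
print(cols_dropped)
```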
22. To Fill the missing values with default value + median value
Here, instead of the median function, we can use other aggregate functions as well, for example count, min, max, or sum.
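A sketch of both fills on a hypothetical `area` column: first with a fixed default of 0, then with the column's median.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"area": [850.0, np.nan, 1200.0, 640.0]})

# Fill with a fixed default value.
filled_default = df["area"].fillna(0)

# Fill with the median of the non-missing values.
filled_median = df["area"].fillna(df["area"].median())
print(filled_default)
print(filled_median)
```

Swapping `median()` for another aggregate such as `min()` or `max()` works the same way.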
23. Histograms – to display the distribution of data
24. Scatter plots – to visualize the relationship between two variables
25. To save our DataFrame as a file
We can provide the path where the resulting DataFrame should be saved.
In this blog post, we covered 25 basic pandas operations and commands. If you think the code above may be helpful to you, please take it from here (github) and enjoy.
If you liked this blog post, please like, share, and follow the blog to show your support for more upcoming posts!
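A minimal sketch of saving to CSV, assuming a temporary directory and a hypothetical file name (`result.csv`) in place of a real target path.

```python
import tempfile
from pathlib import Path

import pandas as pd

df = pd.DataFrame({"id": [1, 2, 3], "city": ["Pune", "Delhi", "Mumbai"]})

# Hypothetical output location; replace with the path you need.
out_path = Path(tempfile.mkdtemp()) / "result.csv"

# index=False leaves the row labels out of the file.
df.to_csv(out_path, index=False)

# Read the file back to confirm what was written.
reloaded = pd.read_csv(out_path)
print(reloaded)
```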