Details in the Book

Pandas is a Python library used for working with datasets: analyzing, cleaning, exploring and manipulating data. Pandas allows us to analyze large datasets and draw conclusions based on statistical theory. It is a fast, powerful, flexible and easy-to-use open source tool built on top of the Python programming language. The library provides data structures designed to handle tabular datasets through a simple Python API. The name ‘pandas’ is derived from the term ‘panel data’, which refers to multi-dimensional datasets. Pandas provides efficient data structures such as DataFrame and Series, along with functions for data cleaning, transformation and analysis. It integrates easily with other Python libraries and is written in Python, Cython and C. DataFrames can be exported to Excel files with the to_excel() method.

Wes McKinney started building the library during his research work from 2007 to 2010. Its original release was on 11 January 2008, and Pandas became open source in 2009. The first edition of Python for Data Analysis was published in 2012.

Pandas DataFrames are often more convenient than their relational SQL counterparts. In Pandas you can build a query incrementally as you go, which is more cumbersome in SQL. Operating on and naming intermediate results is easy in Pandas, while in SQL it is harder. If you know a little Python, it should take barely a month to learn Pandas. Pandas is designed to work with NumPy: NumPy ufuncs work on Pandas Series and DataFrames.

Highlights of Pandas

A fast and efficient DataFrame object.
Tools for reading and writing data between in-memory data structures and different file formats: CSV and text files, MS Excel, SQL databases etc.
Intelligent data alignment and integrated handling of missing data.
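
The points above about building a query step by step, naming intermediate results, and using NumPy ufuncs can be illustrated with a minimal sketch. The sample data, column names and variable names below are hypothetical and are not taken from the book.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, invented for illustration only.
sales = pd.DataFrame({
    "region": ["north", "south", "north", "east"],
    "units": [10, 4, 25, 16],
    "price": [2.0, 5.0, 1.0, 4.0],
})

# Build the analysis incrementally, naming each intermediate result.
with_revenue = sales.assign(revenue=sales["units"] * sales["price"])
large_orders = with_revenue[with_revenue["units"] >= 10]
revenue_by_region = large_orders.groupby("region")["revenue"].sum()

# A NumPy ufunc applied to a Series returns a Series with the same index.
root_units = np.sqrt(sales["units"])

print(revenue_by_region)
print(root_units)
```

Each intermediate name (with_revenue, large_orders) can be inspected on its own in a notebook cell before the next step is added, which is the convenience the description refers to.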
Chapters

Installation of Anaconda
Creating a DataFrame
DataFrame indexing
Accessing DataFrame
Slicing DataFrame
Filtering DataFrame
Sorting DataFrames
Create Pivot Table
Pandas Series
Reading csv, txt, and excel files
Analysing DataFrame to csv file
Analysing DataFrame
Cleaning Data
Pandas Functions

Functions covered: apply, astype, at, clip, concat, copy, corr, cov, describe, diff, drop, drop_duplicates, dropna, dtypes, duplicated, fillna, groupby, head, iloc, info, insert, isna, isnull, iterrows, loc, mask, mean, median, merge, ndim, nlargest, nsmallest, nunique, plot, query, rank, read_csv, rename, replace, resample, reset_index, sample, set_index, shape, size, sort_values, std, sum, tail, to_datetime, to_excel, to_numeric, to_string, transform, unique, where
Note: All examples include screenshots of the code and its output in Jupyter Notebook.