
PANDAS

Pandas is a Python library widely used in machine learning work for data cleaning and analysis. It provides features for exploring, cleaning, transforming, and visualizing data.


Presentation Transcript


  1. PANDAS

  2. Introduction to Pandas
     • Pandas is an open-source Python library that uses its powerful data structures to provide high-performance data manipulation and analysis tools.
     • Developer Wes McKinney started developing pandas in 2008, when high-performance, scalable data analysis tools were needed.
     • The name Pandas is derived from the term "panel data", an econometrics term for multidimensional data.
     • Prior to Pandas, Python was used mainly for data munging and preparation and contributed very little to data analysis. Pandas solved this problem. Using Pandas, we can perform five typical steps in data processing and analysis, regardless of the data source: load, prepare, manipulate, model, and analyse.
     • Python with Pandas is used in a broad range of academic and commercial domains, including banking, economics, mathematics, and analytics.
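The typical steps named on this slide can be sketched on a tiny in-memory CSV. This is only an illustration under stated assumptions: the data, the column names `city`, `temp_c`, and `temp_f`, and the choice of operations are all made up, and the modelling step is left out of the sketch.

```python
import io

import pandas as pd

# Hypothetical CSV text standing in for a real data source.
csv_text = "city,temp_c\nOslo,4\nOslo,6\nCairo,30\nCairo,\n"

# Load: read the CSV into a DataFrame.
df = pd.read_csv(io.StringIO(csv_text))

# Prepare: drop the row whose temperature is missing.
df = df.dropna(subset=["temp_c"])

# Manipulate: derive a Fahrenheit column from the Celsius column.
df["temp_f"] = df["temp_c"] * 9 / 5 + 32

# Analyse: average Fahrenheit temperature per city.
summary = df.groupby("city")["temp_f"].mean()
print(summary)
```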

  3. Features

  4. Applications

  5. Some important functions in Pandas
     • 1. dtype: Returns the dtype of the object.
     • 2. axes: Returns a list of the row axis labels.
     • 3. ndim: Returns the number of dimensions of the underlying data.
     • 4. size: Returns the number of elements in the underlying data.

  6. Some important functions in Pandas (continued)
     • 5. values: Returns the Series as an ndarray.
     • 6. empty: Returns True if the Series is empty.
     • 7. head(): Returns the first n rows.
     • 8. tail(): Returns the last n rows.
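The eight attributes and methods listed on slides 5 and 6 can be tried on a small Series. This is a minimal sketch; the Series `s` and its values are hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical sample data, not taken from the slides.
s = pd.Series([10.0, 20.0, 30.0, 40.0, 50.0, 60.0])

print(s.dtype)    # dtype of the object (float64 here)
print(s.axes)     # list holding the row axis labels
print(s.ndim)     # number of dimensions (1 for a Series)
print(s.size)     # number of elements (6 here)
print(s.values)   # the data as a NumPy ndarray
print(s.empty)    # False, because the Series has elements
print(s.head(2))  # first 2 rows
print(s.tail(2))  # last 2 rows
```

Note that dtype, axes, ndim, size, values, and empty are attributes and take no parentheses, while head() and tail() are methods that accept the number of rows n.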

  7. Pandas DataFrame
     • A DataFrame is a two-dimensional data structure, i.e., data is aligned in a tabular fashion in rows and columns.
     • Features of DataFrame:
       1. Columns can potentially be of different types
       2. Size is mutable
       3. Labeled axes (rows and columns)
       4. Arithmetic operations can be performed on rows and columns
     • Constructor: pandas.DataFrame(data, index, columns, dtype, copy)
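The four DataFrame features listed on this slide can be seen in a short sketch. The column names and values below are hypothetical and serve only as an illustration.

```python
import pandas as pd

# Hypothetical data for illustration only.
df = pd.DataFrame({"name": ["Ann", "Ben"], "score": [80, 90]})

# 1. Columns can be of different types (text and integers here).
print(df.dtypes)

# 2. Size is mutable: a new column can be added after creation.
df["bonus"] = [5, 10]

# 3. Both axes are labeled: row labels and column labels.
print(df.index.tolist(), df.columns.tolist())

# 4. Arithmetic works on whole columns at once.
df["total"] = df["score"] + df["bonus"]
print(df)
```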

  8. pandas.DataFrame parameters
     • data: Takes various forms, such as ndarray, Series, map, lists, dict, constants, or another DataFrame.
     • index: Row labels for the resulting frame. Optional; defaults to np.arange(n) if no index is passed.
     • columns: Column labels for the resulting frame. Optional; defaults to np.arange(n) if no column labels are passed.
     • dtype: Data type of each column.
     • copy: Whether the input data is copied; the default given is False.
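The constructor parameters described above can be exercised together in a minimal sketch. The array contents and the labels `r1`..`r3`, `a`, and `b` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical 3x2 integer array used as the data argument.
values = np.array([[1, 2], [3, 4], [5, 6]])

df = pd.DataFrame(
    values,
    index=["r1", "r2", "r3"],  # row labels
    columns=["a", "b"],        # column labels
    dtype=float,               # cast every column to float
    copy=True,                 # keep the frame independent of `values`
)

# With no index or columns passed, both default to integer labels 0..n-1.
default_df = pd.DataFrame(values)
print(default_df.index.tolist(), default_df.columns.tolist())

# Changing the source array afterwards does not affect the copied frame.
values[0, 0] = 99
print(df.loc["r1", "a"])
```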

  9. Examples
     • Create an empty DataFrame:
         # import the pandas library as pd
         import pandas as pd
         df = pd.DataFrame()
         print(df)
     • O/P:
         Empty DataFrame
         Columns: []
         Index: []
     • Create a DataFrame using lists:
         import pandas as pd
         data = [['Alex', 10], ['Bob', 12], ['Clarke', 13]]
         df = pd.DataFrame(data, columns=['Name', 'Age'])
         print(df)
     • O/P:
              Name  Age
         0    Alex   10
         1     Bob   12
         2  Clarke   13

  10. Create a DataFrame from Dict of ndarrays / Lists:
     • All the ndarrays must be of the same length. If an index is passed, its length must equal the length of the arrays.
     • If no index is passed, then by default the index will be range(n), where n is the array length.
         import pandas as pd
         data = {'Name': ['Tom', 'Jack', 'Steve', 'Ricky'], 'Age': [28, 34, 29, 42]}
         df = pd.DataFrame(data)
         print(df)
     • O/P:
             Name  Age
         0    Tom   28
         1   Jack   34
         2  Steve   29
         3  Ricky   42
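The rule about index length can be checked with a short sketch. The labels `rank1`..`rank4` are hypothetical, and the variable `mismatch_error` exists only to capture the exception for inspection.

```python
import pandas as pd

data = {"Name": ["Tom", "Jack", "Steve", "Ricky"], "Age": [28, 34, 29, 42]}

# An explicit index whose length matches the lists (hypothetical labels).
df = pd.DataFrame(data, index=["rank1", "rank2", "rank3", "rank4"])
print(df)

# An index of the wrong length is rejected with a ValueError.
mismatch_error = None
try:
    pd.DataFrame(data, index=["rank1", "rank2"])
except ValueError as exc:
    mismatch_error = exc
    print("ValueError:", exc)
```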

  11. Create a DataFrame from Dict of Series
     • A dictionary of Series can be passed to form a DataFrame. The resultant index is the union of all the Series indexes passed.
         import pandas as pd
         d = {'one': pd.Series([1, 2, 3], index=['a', 'b', 'c']),
              'two': pd.Series([1, 2, 3, 4], index=['a', 'b', 'c', 'd'])}
         df = pd.DataFrame(d)
         print(df)
     • O/P:
            one  two
         a  1.0    1
         b  2.0    2
         c  3.0    3
         d  NaN    4
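The union behaviour, and the NaN it produces for the label missing from the shorter Series, can be inspected directly. This is a small sketch that reuses the same two Series as the slide; the checks themselves are an addition for illustration.

```python
import pandas as pd

d = {
    "one": pd.Series([1, 2, 3], index=["a", "b", "c"]),
    "two": pd.Series([1, 2, 3, 4], index=["a", "b", "c", "d"]),
}
df = pd.DataFrame(d)

# The row index is the union of both Series indexes.
print(df.index.tolist())

# Label 'd' is absent from 'one', so that cell is filled with NaN.
print(df["one"].isna())

# 'one' becomes float to hold NaN, while 'two' keeps an integer dtype.
print(df.dtypes)
```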
