Basic Features of Pandas – 4 Main Features Used by Data Scientists

Pandas is popular largely because of its basic functionality. The library provides many essential functions that make everyday data work easier, and beginners are strongly encouraged to master them.

Basic functionality of pandas

Before starting with the basic functionality of pandas, you must import the libraries:

>>> import numpy as np
>>> import pandas as pd

Here we create the four main data structures used in pandas.

Index

>>> dataflair_index = pd.date_range('1/1/2000', periods=8)

Series

>>> dataflair_s1 = pd.Series(np.random.randn(5), index=['a', 'b', 'c', 'd', 'e'])

DataFrame

>>> dataflair_df1 = pd.DataFrame(np.random.randn(8, 3), index=dataflair_index, columns=['A', 'B', 'C'])

Panel (removed in pandas 0.25, so this and the later Panel examples require an older version)

>>> dataflair_wp1 = pd.Panel(np.random.randn(2, 5, 4), items=['Item1', 'Item2'], major_axis=pd.date_range('1/1/2000', periods=5), minor_axis=['A', 'B', 'C', 'D'])

Now we can start with the basic functionality of pandas.

1. head() function
2. tail() function
3. Attributes
4. Flexible binary operations

To view the beginning or end of a long Series, we can use the head() or tail() functions.

1. head() function

Let's create a Series with 1000 random values:

>>> dataflair = pd.Series(np.random.randn(1000))

Use the head() function, which returns the first five elements by default:

>>> dataflair.head()

2. tail() function

Now we use the tail() function and set the number of elements to 3:

>>> dataflair.tail(3)
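
The behaviour of head() and tail() is easier to check on predictable data. The sketch below uses np.arange instead of random values; the variable names are illustrative and not part of the original example.

```python
import numpy as np
import pandas as pd

# Deterministic data so the result can be checked by eye.
s = pd.Series(np.arange(1000))

first_five = s.head()    # n defaults to 5
last_three = s.tail(3)   # explicit number of elements

print(first_five.tolist())  # [0, 1, 2, 3, 4]
print(last_three.tolist())  # [997, 998, 999]
```

Both calls return a new Series that keeps the original index labels, so the last three elements are still labelled 997, 998 and 999.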

3. Attributes

Attributes play an important role in the basic functionality of pandas and help data scientists analyze, clean and prepare data quickly. Pandas objects have many attributes that give you access to their metadata.

shape: gives the axis dimensions of the object

Axis labels:

Series: index (the only axis)
DataFrame: index (rows) and columns
Panel: items, major_axis and minor_axis

You can safely assign to these attributes.
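
As a minimal sketch of these attributes, the example below builds a small Series and DataFrame filled with zeros (an assumption made only to keep the data simple) and then reads shape, index and columns. It also reassigns the Series index to show that the axis labels are writable.

```python
import numpy as np
import pandas as pd

dates = pd.date_range('1/1/2000', periods=8)
df = pd.DataFrame(np.zeros((8, 3)), index=dates, columns=['A', 'B', 'C'])
s = pd.Series(np.zeros(5), index=['a', 'b', 'c', 'd', 'e'])

print(df.shape)          # (8, 3)
print(s.shape)           # (5,)
print(list(df.columns))  # ['A', 'B', 'C']
print(list(s.index))     # ['a', 'b', 'c', 'd', 'e']

# Axis labels can be replaced by assignment, as long as the length matches.
s.index = ['v', 'w', 'x', 'y', 'z']
print(list(s.index))     # ['v', 'w', 'x', 'y', 'z']
```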

>>> dataflair_df1[:2]

This prints the first two rows of the DataFrame.

>>> dataflair_df1.columns = [x.lower() for x in dataflair_df1.columns]
>>> dataflair_df1

This assignment changes the uppercase column names to lowercase. To get the actual data inside a pandas data structure, just use the values attribute.

>>> dataflair_s1.values

>>> dataflair_df1.values

>>> dataflair_wp1.values
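
The column renaming and the values attribute described above can be tried on a tiny DataFrame. This is only a sketch with made-up integer data, so the results are predictable.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2], 'B': [3, 4]})

# Assigning to .columns relabels the axis in place.
df.columns = [name.lower() for name in df.columns]
print(list(df.columns))    # ['a', 'b']

# .values exposes the underlying data as a NumPy array.
arr = df.values
print(type(arr).__name__)  # ndarray
print(arr.tolist())        # [[1, 3], [2, 4]]

s = pd.Series([10, 20, 30], index=['x', 'y', 'z'])
print(s.values.tolist())   # [10, 20, 30]
```

Newer pandas versions also provide to_numpy(), which the pandas documentation recommends over values; values is kept here because it is what the article uses.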

4. Flexible binary operations

In binary operations between pandas data structures, there are two key points of interest:

Broadcasting behavior between higher-dimensional objects (e.g. DataFrame) and lower-dimensional objects (e.g. Series)
Missing data in computations

We will look at these two issues separately, although they can be handled at the same time.

4.1 Broadcasting behavior

For broadcasting behavior, Series input is of primary interest. With arithmetic methods such as sub(), you can use the axis keyword to match on either the index or the columns.

>>> dataflair_df = pd.DataFrame({'one': pd.Series(np.random.randn(3), index=['a', 'b', 'c']), 'two': pd.Series(np.random.randn(4), index=['a', 'b', 'c', 'd']), 'three': pd.Series(np.random.randn(3), index=['b', 'c', 'd'])})
>>> dataflair_df

>>> row = dataflair_df.iloc[1]
>>> column = dataflair_df['two']
>>> dataflair_df.sub(row, axis='columns')

>>> dataflair_df.sub(column, axis='index')

Passing axis=0 is equivalent to axis='index':

>>> dataflair_df.sub(column, axis=0)
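
Because the article's DataFrame holds random values, the effect of the axis keyword is hard to verify by eye. The following sketch repeats the same three calls on a small DataFrame with fixed numbers (chosen here purely for illustration).

```python
import pandas as pd

df = pd.DataFrame(
    {'one': [1.0, 2.0, 3.0], 'two': [10.0, 20.0, 30.0]},
    index=['a', 'b', 'c'],
)

row = df.iloc[1]     # Series labelled by column names: one=2.0, two=20.0
column = df['two']   # Series labelled by row names: a=10.0, b=20.0, c=30.0

# Match the Series labels against the column labels:
# the row is subtracted from every row of the DataFrame.
by_columns = df.sub(row, axis='columns')
print(by_columns['one'].tolist())  # [-1.0, 0.0, 1.0]
print(by_columns['two'].tolist())  # [-10.0, 0.0, 10.0]

# Match the Series labels against the row index:
# the column is subtracted from every column of the DataFrame.
by_index = df.sub(column, axis='index')
print(by_index['one'].tolist())    # [-9.0, -18.0, -27.0]
print(by_index['two'].tolist())    # [0.0, 0.0, 0.0]

# axis=0 is the same as axis='index'.
same = df.sub(column, axis=0)
print(same.equals(by_index))       # True
```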

4.1.1 Levels of a MultiIndexed DataFrame

Using a Series, it is possible to align on one level of a MultiIndexed DataFrame.

>>> dataflair_dfmi = dataflair_df.copy()
>>> dataflair_dfmi.index = pd.MultiIndex.from_tuples([(1, 'a'), (1, 'b'), (1, 'c'), (2, 'a')], names=['first', 'second'])
>>> dataflair_dfmi.sub(column, axis=0, level='second')
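
To make the level alignment concrete, here is a small sketch with fixed numbers. The Series named offsets is a hypothetical stand-in for the column used in the article; it is labelled only by the values of the 'second' level.

```python
import pandas as pd

index = pd.MultiIndex.from_tuples(
    [(1, 'a'), (1, 'b'), (1, 'c'), (2, 'a')],
    names=['first', 'second'],
)
df = pd.DataFrame(
    {'one': [1.0, 2.0, 3.0, 4.0], 'two': [10.0, 20.0, 30.0, 40.0]},
    index=index,
)

# One value per label of the 'second' level.
offsets = pd.Series([100.0, 200.0, 300.0], index=['a', 'b', 'c'])

# Each row is matched to offsets through its 'second' label,
# regardless of its 'first' label.
result = df.sub(offsets, axis=0, level='second')

print(result.loc[(1, 'a'), 'one'])  # 1.0 - 100.0 = -99.0
print(result.loc[(1, 'b'), 'one'])  # 2.0 - 200.0 = -198.0
print(result.loc[(2, 'a'), 'two'])  # 40.0 - 100.0 = -60.0
```

Both rows whose 'second' label is 'a' receive the same offset, which is the point of aligning on a level.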

With Panel, describing the matching behavior is somewhat more difficult, so the arithmetic methods instead give you the option to specify the broadcast axis.

>>> major_mean = dataflair_wp1.mean(axis='major')
>>> major_mean

>>> dataflair_wp1.sub(major_mean, axis='major')

Series and Index also support the built-in divmod() function. It performs floor division and the modulo operation at the same time and returns a 2-tuple whose elements have the same type as the left-hand operand.

For Series:

>>> dataflair_s = pd.Series(np.arange(10))
>>> dataflair_s

>>> div, rem = divmod(dataflair_s, 3)  # divide by 3
>>> div
0    0
1    0
2    0
3    1
4    1
5    1
6    2
7    2
8    2
9    3
dtype: int64
>>> rem

For Index:

>>> dataflair_idx = pd.Index(np.arange(10))
>>> dataflair_idx

>>> div, rem = divmod(dataflair_idx, 3)
>>> div
Int64Index([0, 0, 0, 1, 1, 1, 2, 2, 2, 3], dtype='int64')
>>> rem

We can also apply divmod() element-wise. Here the first two elements are divided by 2, the next two by 3, and so on:

>>> div, rem = divmod(dataflair_s, [2, 2, 3, 3, 4, 4, 5, 5, 6, 6])
>>> div

>>> rem
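
The divmod() examples above are deterministic, so the quotients and remainders can be checked directly. The sketch below reruns them and prints the results as plain lists; the names div_by_3, rem_by_3, div_elem and rem_elem are illustrative.

```python
import numpy as np
import pandas as pd

s = pd.Series(np.arange(10))

# Scalar divisor: every element is divided by 3.
div_by_3, rem_by_3 = divmod(s, 3)
print(div_by_3.tolist())  # [0, 0, 0, 1, 1, 1, 2, 2, 2, 3]
print(rem_by_3.tolist())  # [0, 1, 2, 0, 1, 2, 0, 1, 2, 0]

# List divisor: element i of the Series is divided by element i of the list.
div_elem, rem_elem = divmod(s, [2, 2, 3, 3, 4, 4, 5, 5, 6, 6])
print(div_elem.tolist())  # [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]
print(rem_elem.tolist())  # [0, 1, 2, 0, 0, 1, 1, 2, 2, 3]

# The results have the same type as the left-hand operand.
idx_div, idx_rem = divmod(pd.Index(np.arange(10)), 3)
print(isinstance(div_by_3, pd.Series), isinstance(idx_div, pd.Index))  # True True
```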

4.2 Missing values

In DataFrame and Series, the arithmetic functions accept an optional fill_value, a value to substitute when at most one of the values at a location is missing. For example, when adding two DataFrame objects you may want to treat NaN as 0. However, if the value is missing from both DataFrames, the result will still be NaN. You can replace it with a different value afterwards using the fillna() function.

>>> dataflair_df

>>> dataflair_df2 = pd.DataFrame({'one': pd.Series(np.random.randn(3), index=['a', 'b', 'c']), 'two': pd.Series(np.random.randn(4), index=['a', 'b', 'c', 'd']), 'three': pd.Series(np.random.randn(3), index=['b', 'c', 'd'])})
>>> dataflair_df2

>>> dataflair_df + dataflair_df2

>>> dataflair_df.add(dataflair_df2, fill_value=0)  # same operation as '+', but a value missing on one side is treated as 0
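
The three cases described in this section (value present on both sides, missing on one side, missing on both) can be seen in a single column. The data below is made up for illustration.

```python
import numpy as np
import pandas as pd

a = pd.DataFrame({'x': [1.0, np.nan, np.nan]})
b = pd.DataFrame({'x': [10.0, 20.0, np.nan]})

# Plain '+': any NaN operand gives NaN.
plain = a + b
print(plain['x'].isna().tolist())   # [False, True, True]

# fill_value=0: a value missing on one side only is treated as 0;
# a value missing on both sides stays NaN.
filled = a.add(b, fill_value=0)
print(filled['x'].isna().tolist())  # [False, False, True]

# Remaining NaN values can be replaced afterwards with fillna().
cleaned = filled.fillna(-1)
print(cleaned['x'].tolist())        # [11.0, 20.0, -1.0]
```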

Summary

All in all, the basic functionality of pandas covers a lot of ground, but these are the main features, alongside flexible comparisons and boolean reductions.

Tags: Python

Posted by akop on Mon, 23 May 2022 05:19:42 +0300