Basic Pandas functions for working with data
In this article, we will explore the Pandas library and how to use it to handle the different types of data you may encounter during your analysis. By the end of the tutorial, you will be more fluent in using Pandas functions.
pandas.read_csv() is the simplest way to read a CSV file. It has many parameters that cover most use cases. To read only the columns we need, pass a list of the column names to usecols. We can also limit the number of rows by passing a number to nrows.
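As a minimal sketch of the two parameters described above, the example below reads from an in-memory string rather than a file. The CSV content and its column names (City, Country, Population) are made up for illustration and are not the tutorial's data.

```python
import io

import pandas as pd

# Hypothetical CSV content; a real call would pass a file path instead.
csv_text = """City,Country,Population
Paris,France,2100000
Berlin,Germany,3600000
Madrid,Spain,3300000
"""

# usecols keeps only the listed columns; nrows limits how many rows are read.
df = pd.read_csv(io.StringIO(csv_text), usecols=["City", "Population"], nrows=2)
print(df)
```

Reading fewer columns and rows this way can be handy for a quick look at a large file before loading all of it.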
We can select a pandas column by accessing it directly, for example: df['City']. Another way is to access the column as an attribute (df.City), but in this case the column name must follow the rules for naming variables (no spaces, starts with a letter, …).
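Both ways of selecting a column can be sketched as follows, assuming a small hypothetical DataFrame with City and Population columns.

```python
import pandas as pd

# Hypothetical data, used only for illustration.
df = pd.DataFrame({"City": ["Paris", "Berlin"], "Population": [2100000, 3600000]})

cities = df["City"]                   # bracket access works for any column name
same_cities = df.City                 # attribute access needs a valid identifier
subset = df[["City", "Population"]]   # a list of names returns a DataFrame

print(cities)
```

Bracket access is the safer default, since it also works for names that contain spaces or clash with existing DataFrame attributes.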
One way to rename columns in pandas is the rename() method. It is very useful when we need to rename only some specific columns, because we provide information only for the columns to be renamed.
Columns can also be renamed by directly assigning a list of the new names to the columns attribute of the DataFrame object whose columns we want to rename. The disadvantage of this method is that we need to provide new names for all columns, even if we want to rename only some of them.
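The two renaming approaches might look like this. The DataFrame and the new names (Residents, city, population) are hypothetical.

```python
import pandas as pd

# Hypothetical data, used only for illustration.
df = pd.DataFrame({"City": ["Paris", "Berlin"], "Population": [2100000, 3600000]})

# rename(): only the columns listed in the mapping are changed.
renamed = df.rename(columns={"Population": "Residents"})

# Direct assignment: a new name must be supplied for every column.
df_all = df.copy()
df_all.columns = ["city", "population"]

print(renamed.columns.tolist())
print(df_all.columns.tolist())
```

Note that rename() returns a new DataFrame and leaves the original untouched, while assigning to columns modifies the object in place.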
Dropping one or more columns of a DataFrame can be achieved in several ways. The most popular is the .drop() method. With it we can drop multiple columns or rows.
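A short sketch of dropping columns and rows with .drop(), assuming a hypothetical three-column DataFrame:

```python
import pandas as pd

# Hypothetical data, used only for illustration.
df = pd.DataFrame(
    {
        "City": ["Paris", "Berlin", "Madrid"],
        "Country": ["France", "Germany", "Spain"],
        "Population": [2100000, 3600000, 3300000],
    }
)

without_cols = df.drop(columns=["Country", "Population"])  # drop several columns
without_rows = df.drop(index=[0, 2])                       # drop rows by index label

print(without_cols)
print(without_rows)
```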
To sort a Pandas DataFrame we use the .sort_values() method. It can sort the values in ascending or descending order.
We can sort by multiple criteria by passing a list of the columns we want to sort by.
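Sorting by one column and by several columns could be sketched like this, again on made-up data:

```python
import pandas as pd

# Hypothetical data, used only for illustration.
df = pd.DataFrame(
    {
        "City": ["Paris", "Berlin", "Lyon"],
        "Country": ["France", "Germany", "France"],
        "Population": [2100000, 3600000, 500000],
    }
)

# Single column, descending order.
by_population = df.sort_values("Population", ascending=False)

# Several columns: Country ascending, then Population descending within each country.
by_country_then_pop = df.sort_values(["Country", "Population"], ascending=[True, False])

print(by_population)
print(by_country_then_pop)
```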
Filtering is a common operation in data analysis, and Pandas provides a variety of ways to filter data points. Here we use Boolean conditions, both single and combined. There are many other filtering techniques as well.
To filter by multiple criteria, use | instead of or (and & instead of and), and wrap each condition in parentheses. Longer conditions can be combined in the same way.
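A minimal sketch of Boolean filtering with a single condition and with combined conditions. The data and the thresholds are assumptions chosen for illustration.

```python
import pandas as pd

# Hypothetical data, used only for illustration.
df = pd.DataFrame(
    {
        "City": ["Paris", "Berlin", "Lyon", "Porto"],
        "Country": ["France", "Germany", "France", "Portugal"],
        "Population": [2100000, 3600000, 500000, 230000],
    }
)

# Single condition: a Boolean mask selects the matching rows.
large = df[df["Population"] > 1_000_000]

# Combined conditions: each one in parentheses, joined with | (or) or & (and).
french_or_large = df[(df["Country"] == "France") | (df["Population"] > 3_000_000)]
french_and_large = df[(df["Country"] == "France") & (df["Population"] > 1_000_000)]

print(french_or_large)
```

The parentheses matter because | and & bind more tightly than comparison operators such as == and >.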
The string methods on Series and Index, available through the .str accessor, are especially useful for cleaning or transforming DataFrame columns. For more information on how to work with text data in Pandas, please check this article: 🐼 Pandas functions for text data processing
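As a small illustration of string methods used for cleaning, assuming a hypothetical City column with inconsistent spacing and capitalization:

```python
import pandas as pd

# Hypothetical messy text data, used only for illustration.
df = pd.DataFrame({"City": ["  paris", "BERLIN ", "Madrid"]})

# Strip surrounding whitespace, then normalize the capitalization.
df["City"] = df["City"].str.strip().str.title()

# String methods can also produce Boolean masks for filtering.
contains_ar = df["City"].str.contains("ar")

print(df)
print(contains_ar)
```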
To check your data types, you can use .dtypes. It returns a pandas Series with the dtype of each column. The simplest way to convert a pandas column to a different type is astype(), called either on a single Series or on the whole DataFrame with a mapping from column names to types.
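Checking and converting types might look like the sketch below. The columns are hypothetical, and astype() is used here as one common way to convert, both on a single column and with a column-to-type mapping.

```python
import pandas as pd

# Hypothetical data: Population was read as text, Area as integers.
df = pd.DataFrame(
    {
        "City": ["Paris", "Berlin"],
        "Population": ["2100000", "3600000"],
        "Area": [105, 891],
    }
)

print(df.dtypes)  # one dtype per column, returned as a Series

# Convert a single column.
df["Population"] = df["Population"].astype("int64")

# Convert using a mapping from column name to target type.
df = df.astype({"Area": "float64"})

print(df.dtypes)
```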
A groupby operation involves a combination of splitting the object, applying a function, and combining the results. This can be used to group large amounts of data and compute operations on these groups.
Multiple aggregation functions can be applied simultaneously.
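The split, apply, and combine steps can be sketched as follows, with a made-up Country column as the grouping key:

```python
import pandas as pd

# Hypothetical data, used only for illustration.
df = pd.DataFrame(
    {
        "Country": ["France", "France", "Germany"],
        "City": ["Paris", "Lyon", "Berlin"],
        "Population": [2100000, 500000, 3600000],
    }
)

# One aggregation per group.
totals = df.groupby("Country")["Population"].sum()

# Several aggregations at once: one output column per function.
stats = df.groupby("Country")["Population"].agg(["min", "max", "mean"])

print(totals)
print(stats)
```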
Missing data is a very common problem in real-life scenarios. In Pandas, missing data is represented by two values: NaN and None. Pandas has many useful functions for detecting, removing and replacing null values in a DataFrame:
.isna() is used to find missing values
.dropna() is used to remove them
.fillna() is used to fill NaN with a specified value.
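The three functions listed above could be used like this. The data and the fill values ("Unknown" and 0) are arbitrary choices for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical data with one None and one NaN.
df = pd.DataFrame(
    {
        "City": ["Paris", None, "Madrid"],
        "Population": [2100000, 3600000, np.nan],
    }
)

missing_per_column = df.isna().sum()   # count missing values in each column
complete_rows = df.dropna()            # keep only rows without missing values
filled = df.fillna({"City": "Unknown", "Population": 0})  # per-column fill values

print(missing_per_column)
print(filled)
```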
It is common for a DataFrame to use a default index that is simply a range of len(data). For specific cases (e.g. time series data), we need to change the index to something more meaningful. To set an index, we simply pass the column to set_index().
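A brief sketch of replacing the default index with a column, assuming the set_index() method and a hypothetical Date column:

```python
import pandas as pd

# Hypothetical time series data, used only for illustration.
df = pd.DataFrame({"Date": ["2023-01-01", "2023-01-02"], "Sales": [10, 12]})

print(df.index)  # the default index is a RangeIndex over len(df)

# Use the Date column as the index instead.
indexed = df.set_index("Date")

print(indexed.loc["2023-01-02", "Sales"])
```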
Pandas is built on top of NumPy, so it tries to follow its conventions about slicing. iloc works with integer positions and is sliced just like a NumPy array. This is not the case for loc, which is sliced by labels that may be of other types, such as strings or dates.
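The difference between the two can be illustrated on a hypothetical DataFrame with string labels. One detail worth noting in the sketch: a position slice excludes its end, while a label slice includes both endpoints.

```python
import pandas as pd

# Hypothetical data with city names as index labels.
df = pd.DataFrame(
    {"Population": [2100000, 3600000, 3300000]},
    index=["Paris", "Berlin", "Madrid"],
)

by_position = df.iloc[0:2]           # positions 0 and 1; the end is excluded
by_label = df.loc["Paris":"Madrid"]  # label slice; both endpoints are included

print(by_position)
print(by_label)
```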
A DateTime is a combination of a date and a time in the format "yyyy-mm-dd HH:MM:SS", where yyyy-mm-dd is referred to as the date and HH:MM:SS as the time. Getting our dates as datetime64 objects gives us access to a lot of date and time information through the .dt accessor. pd.to_datetime() converts the strings holding our dates to datetime64.
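Converting strings to datetime64 and reading date parts from them might look like this. The column names and timestamps are made up.

```python
import pandas as pd

# Hypothetical timestamps stored as text.
df = pd.DataFrame({"Date": ["2023-01-15 08:30:00", "2023-02-20 17:45:00"]})

# Convert the string column to datetime64.
df["Date"] = pd.to_datetime(df["Date"])

# Date and time components are then available through the .dt accessor.
df["Year"] = df["Date"].dt.year
df["Month"] = df["Date"].dt.month
df["Hour"] = df["Date"].dt.hour

print(df)
```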
An important part of data analysis is finding duplicate values and removing them. The pandas duplicated() method helps with finding duplicates. It returns a boolean Series that is True for every row that repeats an earlier one; by default, the first occurrence is marked False.
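Finding and removing duplicates can be sketched as follows. The data is hypothetical, and drop_duplicates() is shown as one common way to do the removal.

```python
import pandas as pd

# Hypothetical data in which the third row repeats the first.
df = pd.DataFrame(
    {
        "City": ["Paris", "Berlin", "Paris"],
        "Country": ["France", "Germany", "France"],
    }
)

dup_mask = df.duplicated()           # True for rows that repeat an earlier row
deduplicated = df.drop_duplicates()  # keeps the first occurrence of each row

print(dup_mask)
print(deduplicated)
```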
The pandas apply() method lets users pass a function and apply it to every single value of a pandas Series.
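A minimal sketch of applying a function to each value of a Series. The helper size_label and its threshold are hypothetical.

```python
import pandas as pd

# Hypothetical data, used only for illustration.
df = pd.DataFrame(
    {"City": ["Paris", "Berlin"], "Population": [2100000, 3600000]}
)


def size_label(population):
    """Hypothetical helper: label a city by an arbitrary population threshold."""
    return "large" if population >= 3_000_000 else "small"


# apply() calls size_label once for every value in the column.
df["Size"] = df["Population"].apply(size_label)

print(df)
```

For simple arithmetic or comparisons, vectorized operations on the column are usually faster than apply(), which is better kept for logic that cannot be expressed that way.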
- We’ve covered 15 practical recipes to get you started with Pandas quickly. Each of them is useful in certain situations.
- Pandas is a powerful library for data analysis and manipulation. It provides many functions and methods for dealing with data in the form of a table. As with any other tool, the best way to learn pandas is through practice.
Thank you for reading. Please let me know if you have any feedback or suggestions.
The data used in this tutorial