02 Plotting using Pandas
《Python数据分析技术栈》 (Python Data Analysis Technology Stack), Chapter 07 Python Data Visualization: 02 Plotting using Pandas
The Pandas library uses Matplotlib behind the scenes for visualization, but its plotting functions are usually more concise and convenient because they are called directly on a Series or DataFrame. Pandas expects the data to be in wide or aggregated format: each series to be plotted sits in its own column, and counts or other summaries are computed before plotting.
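To make the wide-format requirement concrete, here is a minimal sketch that reshapes long-format data into wide format with pivot. The DataFrame, its column names, and its values are hypothetical and serve only as an illustration.

```python
import pandas as pd

# Hypothetical long-format data: one row per (month, product) observation.
long_df = pd.DataFrame({
    "month":   ["Jan", "Jan", "Feb", "Feb"],
    "product": ["A", "B", "A", "B"],
    "sales":   [10, 7, 12, 9],
})

# Reshape to wide format: one row per month, one column per product.
wide_df = long_df.pivot(index="month", columns="product", values="sales")

# pivot sorts the index alphabetically, so restore calendar order.
wide_df = wide_df.loc[["Jan", "Feb"]]

print(wide_df)
```

Calling wide_df.plot() on the reshaped frame would draw one line per column, which is why the wide layout is the convenient one for Pandas plotting.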
The plot method in Pandas (built on the Matplotlib plot function) lets us create a wide variety of plots simply by setting the kind parameter, which specifies the type of plot. This resembles polymorphism in object-oriented programming (one of the principles of OOP, which we studied in Chapter 2), in that the same method does different things depending on the value of kind.
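The role of the kind parameter can be sketched as follows: the same plot call is repeated on one Series, and only the value of kind changes. The Series and its values are hypothetical, and the four kinds shown are just a sample of the values Pandas accepts.

```python
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical values, used only to illustrate the kind parameter.
scores = pd.Series([3, 5, 2, 8, 6], index=list("abcde"), name="score")

# One subplot per plot type; only the value of kind changes between calls.
kinds = ["line", "bar", "barh", "box"]
fig, axes = plt.subplots(1, len(kinds), figsize=(12, 3))
for ax, kind in zip(axes, kinds):
    scores.plot(kind=kind, ax=ax, title=kind)

fig.tight_layout()
```

Pandas also exposes each plot type as an accessor method, so scores.plot.bar() is an alternative spelling of scores.plot(kind="bar").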
Let us learn how to create plots in Pandas using the Iris dataset.
The Iris dataset contains samples from three species of the iris plant (Iris setosa, Iris versicolor, and Iris virginica), with 50 samples of each species. Each sample has five attributes: sepal length, sepal width, petal length, petal width, and species. The dataset is bundled with the scikit-learn library (imported as sklearn) and can be loaded into a DataFrame as follows:
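import pandas as pd
from sklearn.datasets import load_iris

# Load the bundled Iris data and wrap the four measurements in a DataFrame
data = load_iris()
iris = pd.DataFrame(data=data.data, columns=data.feature_names)

# Map the numeric target codes (0, 1, 2) to species names
iris['species'] = pd.Series(data['target']).map(
    {0: 'setosa', 1: 'versicolor', 2: 'virginica'})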
Scatter plot
A scatter plot helps us see whether two variables are related, for example whether there is a linear relationship between them. To generate a scatter plot in Pandas, we pass the value 'scatter' to the kind parameter and name the columns to plot with the x and y parameters of the plot function. The graph in Figure 7-2 suggests that the two variables (“petal length” and “petal width”) are linearly correlated.
iris.plot(kind='scatter',x='petal length (cm)',y='petal width (cm)')
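The linear relationship suggested by the scatter plot can also be checked numerically. The following is a small sketch, not part of the original example, that computes the Pearson correlation coefficient between the two plotted columns with the Series corr method; a value close to 1 would support a strong positive linear relationship.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Rebuild the DataFrame so that this sketch is self-contained.
data = load_iris()
iris = pd.DataFrame(data=data.data, columns=data.feature_names)

# Pearson correlation coefficient between the two plotted columns.
r = iris["petal length (cm)"].corr(iris["petal width (cm)"])
print(f"Pearson r = {r:.3f}")
```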
Histogram
A histogram visualizes a frequency distribution: each bar represents the number of values that fall within an interval (Figure 7-3). Passing the value 'hist' to the kind parameter of the plot function creates a histogram.
iris['sepal width (cm)'].plot(kind='hist')
As we can see from this histogram, the “sepal width” variable is approximately normally distributed.
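As a rough check on the shape seen in the histogram, the sketch below redraws it with an explicit number of bins and prints a few summary statistics. The choice of 20 bins is arbitrary and only for illustration. For a roughly symmetric, bell-shaped variable we would expect the mean and median to be close and the skewness to be near zero.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Rebuild the DataFrame so that this sketch is self-contained.
data = load_iris()
iris = pd.DataFrame(data=data.data, columns=data.feature_names)
sepal_width = iris["sepal width (cm)"]

# bins sets the number of intervals; 20 is an arbitrary illustrative choice.
ax = sepal_width.plot(kind="hist", bins=20, title="Sepal width (cm)")

# Rough numeric checks on the shape of the distribution.
print("mean:  ", round(sepal_width.mean(), 3))
print("median:", round(sepal_width.median(), 3))
print("skew:  ", round(sepal_width.skew(), 3))
```

These statistics are only an informal sanity check, not a formal test of normality.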
Pie charts
A pie chart shows the values that make up a variable as sectors of a circle (Figure 7-4). Note that Pandas does not aggregate the data on its own when plotting, so we first call the value_counts function to count the number of values in each category (we will later see that the Seaborn library takes care of aggregation when plotting). We then pass the value 'pie' to the kind parameter to create the pie chart.
iris['species'].value_counts().plot(kind='pie')
We see that the three species (“virginica”, “setosa”, and “versicolor”) form equal parts of the circle, that is, each has the same number of samples.
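The aggregation step can be made visible by printing the result of value_counts before plotting it. This is a minimal sketch; the autopct argument is an optional addition not used in the original example, and it is forwarded to Matplotlib to label each sector with its percentage.

```python
import pandas as pd
from sklearn.datasets import load_iris

# Rebuild the DataFrame so that this sketch is self-contained.
data = load_iris()
iris = pd.DataFrame(data=data.data, columns=data.feature_names)
iris["species"] = pd.Series(data.target).map(
    {0: "setosa", 1: "versicolor", 2: "virginica"}
)

# Aggregate first: value_counts returns one count per category.
counts = iris["species"].value_counts()
print(counts)

# Plot the aggregated counts; autopct adds a percentage label to each sector.
ax = counts.plot(kind="pie", autopct="%1.1f%%")
```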
The Pandas plot method is easy to use: by merely changing the value of the kind parameter, we can plot a variety of graphs.
Further reading: See more about the kinds of plots that can be used in Pandas: https://pandas.pydata.org/pandas-docs/stable/user_guide/visualization.html#other-plots