“multi-index”的相关标签问题

0 投票

0 回答

184 浏览

csv - 使用 pandas 导入具有多索引列的 CSV 文件

我正在尝试使用 Pandas read_csv 导入具有多索引列的 cvs 数据，但是当我将 header 参数设置为整数列表header=[7,8]时，即使该列表只是一个整数时也会出错header=[7]。导入成功header=7，但在这种情况下，只有一级列名。

我知道我可以单独指定多索引列名称，但由于列很少，我希望有一个更有效的解决方案。

这是数据：

在 0.14 版中，这会产生一堆未命名的列，而不是所需的跨区列。

2013-12-05T09:37:17.863

0 投票

1 回答

3986 浏览

python - 在 Pandas 中，如何根据列的值对多索引的一个级别进行排序，同时保持另一级别的分组

我现在正在大学学习数据挖掘课程，但我有点卡在多索引排序问题上。

实际数据涉及大约 100 万条电影评论，我正在尝试根据美国邮政编码对其进行分析，但为了测试如何做我想做的事情，我一直在使用一个小得多的 250 个随机生成的数据集10 部电影的评分，而不是邮政编码，我使用的是年龄组。

所以这就是我现在所拥有的，它是 Pandas 中的多索引 DataFrame，有两个级别，“组”和“标题”

我的目标是根据组内的评分对标题进行排序（并且只显示每个组中最受欢迎的 5 个左右的标题）

所以像这样（但我只会在每组中显示两个标题）：

有人知道怎么做吗？我尝试过 sort_order、sort_index 等并交换级别，但它们也混淆了组。所以它看起来像：

我正在寻找这样的东西：Pandas 中的多索引排序，但不是基于另一个级别进行排序，而是我想根据值进行排序。有点像那个人想根据他的销售列进行排序。

谢谢！

python sorting pandas multi-index

2013-12-05T23:55:45.723

0 投票

1 回答

163 浏览

python - Importing multilevel indexed cvs data with pandas 0.13

I am trying to import CSV data with multi-indexed columns using Pandas 0.13 read_csv. The import is successful, but the resulting multi-index for the columns is surprising and not useful for me.

employment1976-1987thousands.csv:

The resulting columns index is

and I was hoping to get it in the form

python csv pandas multi-index

2013-12-09T14:37:46.730

0 投票

1 回答

171 浏览

boost - boost::multi_index 容器中迭代器功能投影的复杂性

有谁知道 boost::multi_index 库中迭代器投影的复杂性？文档可以在这里找到boost::multi_index projection of iterators但它没有说明操作的复杂性。

基本思想是，您可以检索索引内对象的迭代器，然后将其投影到第二个索引中，并获取同一对象但在第二个索引内的迭代器。如果这是一个 O(1) 操作，那么您可以有效地维护两个索引，一个是可快速搜索的，一个是较慢的。据我了解，迭代器的投影允许我在索引中找到一个搜索速度更快的对象，然后将其投影到搜索速度较慢的索引中。

我很想知道它是否是迭代器投影的简单 O(1) 查找，或者它是否有效地启动了第二个索引中的查找操作，因此取决于您投影到的特定索引并且速度较慢比 O(1)。

非常感谢您的帮助！

boost time-complexity multi-index boost-multi-index boost-iterators

2013-12-09T23:58:06.243

0 投票

3 回答

1393 浏览

python-2.7 - Replace column values in pandas multiindexed dataframe

I want to make a conditional replacement based on the first index value in my pandas dataframe. If I have a dataframe such as:

I think I should be able to replace value in column via:

But this returns the original dataframe, unchanged. Can anyone explain how this should be done and why my current method does not work?

python-2.7 pandas multi-index

2013-12-10T21:11:09.607

0 投票

1 回答

7050 浏览

python - 在多索引 pandas DataFrame 上选择一列

鉴于此数据框：

如何绘制图表：X 轴：1、2、3。三个系列名称分别为：bar、baz、foo。Y 轴值：“一”列。每个点旁边的标签是“二”列。

所以，换句话说，假设我有三只股票（bar、baz 和 foo），每只股票在每个日期（1、2、3）都有各自的股票价格（'one'），以及每个日期的评论点位于“二”列。我怎么能把它画出来？

（抱歉没有显示df表，不知道如何正确复制）

python matplotlib pandas multi-index

2013-12-13T15:55:45.247

0 投票

1 回答

3452 浏览

python - 通过索引和列合并/加入/附加两个带有 MultiIndex 列的 Pandas DataFrame

我一直在用头撞我的桌子，不知道是否有办法，也许我正在尝试一些不可能的事情。

我有两个带有 MultiIndex 列（三个级别）和时间索引（单级）的 DataFrame。第一个是这样的：

对于这个 DF，我想添加以下内容：

考虑行和列索引。结果应该是：

我尝试了几种方法，包括使用 Merge 和 Join，但无法使其正常工作。

有任何想法吗？提前致谢。

PS我可以发布我用来生成上面两个DF的代码，如果有帮助的话，但它有点长。但无论如何，我正在寻找一个通用的答案，列的确切名称或行的索引是无关紧要的（甚至可能是整数的索引）。

python pandas dataframe multi-index

2013-12-20T11:31:44.717

0 投票

4 回答

3579 浏览

python - 在 Pandas 中对行和列 MultiIndex 使用布尔索引

问题在最后，以粗体显示。但首先，让我们设置一些数据：

这给出了：

我想提取这个 DataFrame 的一个子集以供后续分析。假设我想切出jersey数字为 71 的行。我真的不喜欢使用xs来执行此操作的想法。当你做一个横截面时，xs你会丢失你选择的列。如果我运行：

然后我取回正确的行，但我丢失了jersey列。

xs_slice

此外，对于我想从列中xs获取一些不同值的情况，这似乎不是一个很好的解决方案。jersey我认为一个更好的解决方案是在这里找到的：

boolean_slice_1

您甚至可以过滤球衣和球队的组合：

boolean_slice_2

好的！

所以问题是：我怎样才能做类似的事情来选择列的子集。 例如，假设我只想要代表 Ralph 数据的列。我怎么能不使用xs呢？或者如果我只想要带有的列observer in ['John', 'Ralph']怎么办？同样，我真的更喜欢一种解决方案，它可以在结果中保留所有级别的行和列索引......就像上面的布尔索引示例一样。

我可以做我想做的事，甚至可以组合来自行索引和列索引的选择。但我发现的唯一解决方案涉及一些真正的体操：

double_boolean_slice

因此第二个问题：有没有更紧凑的方法来做我刚才所做的事情？

python pandas multi-index

2013-12-24T04:10:12.893

0 投票

2 回答

927 浏览

python - Python & Numpy：使用 multi_index 迭代特定轴？

我有一个带有五个轴的数组：

我想使用此链接multi_index的第二个示例中的方法对其进行迭代，但不是迭代整个 5 个维度，而是要迭代前三个维度。一种 Pythonic 的方式（不涉及 Numpy）将是这样的：

如何在纯 Numpy 中实现这一点？

python loops numpy axes multi-index

2013-12-29T00:56:16.340

0 投票

1 回答

2229 浏览

python - Pandas how to access a multi-index DataFrame to select values to make a heatmap..!

thanks for taking time to click on this..!

Right I'm trying to make a heatmap..

The problem I have is I have data like this:

please see screen show of DataFrame:

These values are all unique, meaning that there is only one profit for each combination of a & b
I want to : plot a heatmap of x,y and the colour is the z value..

My x-coordinate is a list that goes from 1.01 to 2.0 at 0.01 intervals

My y-coordinate is a list that does the same e.g. [1.01,1.02,1.03.. 1.99, 2.0] So I make a meshgrid of this x & y

My z coordinate I want to be 'profit' from the column above, when x = a and b = y, plot the profit

But I'm having problems vectorising, extracting profit using x & y with a meshgrid and looking up profit when a = x and b = y.. I'll explain some more

Make a meshgrid with x & y For all combinations of x & y, find a & b and lookup profit, using a = x and b = y If no such combination exists, return 0 If the combination exists, if profit is +, plot a green point If profit is -, plot a red point. The bigger the profit, the brighter the green The bigger the loss, the brighter the red, Zero can be represented by eg blue

Here was an attempt as some code I did.. I've changed it round so much trying different things that it prob wont make much sense, but you will get a gist of what I've been trying to do..

Guys, I've been playing round with this all day, reading stuff on line trying lookup(), also taking out the hierarahical index and just having a dataframe with x, y and profit columns.. but I keep getting silly errors, err.. I don't actually know what I'm doing!

Can anyone shed some light on this? Would be very much appreciated!!!!!

P.s. Happy New Year 2014 guys!

python indexing pandas heatmap multi-index

2014-01-03T18:07:55.093

问题标签 [multi-index]

Reference