How to change index dtype of pandas DataFrame to i

Posted 2019-01-19 16:58

The default dtype of a DataFrame index is int64, and I would like to change it to int32.

I tried changing it with pd.DataFrame.set_index and a NumPy array of int32, and I also tried creating a new index with dtype=np.int32. Neither worked; the result was always an int64 index.

Can someone show working code that produces a pandas index with int32 size?

I am using pandas v0.20.1 from conda.

3 answers
beautiful°
#2 · 2019-01-19 17:15

Can someone show working code that produces a pandas index with int32 size?

@PietroBattiston's answer may work, but it is worth explaining why you would ordinarily not want to replace the default RangeIndex with an Int64 / Int32 index.

Storing only the rule that generates a range (start, stop, step) takes less memory than storing every integer in that range. This should be clear when you compare, for instance, Python's built-in range with NumPy's np.arange. As described in the pd.RangeIndex docs:

RangeIndex is a memory-saving special case of Int64Index limited to representing monotonic ranges. Using RangeIndex may in some instances improve computing speed.
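To make the memory argument concrete, here is a minimal sketch that compares a lazy range with a fully materialized integer array, and a RangeIndex with an integer index built from that array. The size n and all variable names are made up for illustration, and the exact byte counts depend on your Python and pandas versions, so treat the printed numbers as indicative only.

```python
import sys

import numpy as np
import pandas as pd

n = 1_000_000  # illustrative size, not taken from the question

# A range object stores only start/stop/step; the array stores every value.
lazy_range = range(n)
dense_array = np.arange(n, dtype=np.int64)
print("range object bytes:", sys.getsizeof(lazy_range))
print("int64 array bytes: ", dense_array.nbytes)

# The same contrast at the pandas level.
range_index = pd.RangeIndex(n)
dense_index = pd.Index(dense_array)
print("RangeIndex bytes:   ", range_index.memory_usage())
print("materialized bytes: ", dense_index.memory_usage())
```

Even an int32 index would need about 4 bytes per row, which is still far more than a RangeIndex that never stores the individual values.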

#3 · 2019-01-19 17:30

Not sure this is something worth doing in practice, but the following should work:

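import numpy as np
import pandas as pd

# Relies on pandas internals as of the 0.20.x releases. pd.Int64Index was
# deprecated in pandas 1.4 and removed in 2.0, so this fails on recent versions.
class Int32Index(pd.Int64Index):
    # Dtype the numeric index constructor casts incoming data to
    _default_dtype = np.int32

    @property
    def asi8(self):
        # Hand back the stored values as they are
        return self.values

i = Int32Index(np.array([...], dtype='int32'))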

(from here)

Root(大扎)
#4 · 2019-01-19 17:31

All of the code paths I could find coerce the dtype.

The check in pandas.Index.__new__():

if issubclass(data.dtype.type, np.integer):
    from .numeric import Int64Index
    return Int64Index(data, copy=copy, dtype=dtype, name=name)

This allows passing a dtype, but in NumericIndex.__new__() we have:

if copy or not is_dtype_equal(data.dtype, cls._default_dtype):
    subarr = np.array(data, dtype=cls._default_dtype, copy=copy)

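This casts the data to the class's _default_dtype, which is int64 for Int64Index, regardless of the dtype that was passed in.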
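You can check how your installed version behaves with a small sketch like the one below. The values and variable names are made up for illustration. On the 0.20.x version from the question I would expect every line to report int64, because of the coercion shown above. As far as I know, pandas 2.0 and later let Index hold any NumPy numeric dtype, so there the int32 dtype should be preserved; the output is therefore version dependent and I have deliberately not listed an expected result.

```python
import numpy as np
import pandas as pd

# Illustrative data; any small integer array will do.
values = np.array([1, 2, 3], dtype=np.int32)

# Build an index directly from the int32 array.
idx = pd.Index(values)
print("pandas version:", pd.__version__)
print("pd.Index(values).dtype:", idx.dtype)

# Use the same array as the index of a DataFrame.
df = pd.DataFrame({"a": [10, 20, 30]}, index=idx)
print("df.index.dtype:", df.index.dtype)

# Try an explicit cast on an existing index.
cast = df.index.astype("int32")
print("astype('int32').dtype:", cast.dtype)
```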
