Data preparation to upload into Redis server

Posted 2019-07-29 09:39

Question:

I have a 10 GB .xml file that I want to load into a Redis server using mass insertion. I need advice on how to convert this .xml data into key-value pairs or any other data structure supported by Redis. I am working with the Stack Overflow dumps; take comments.xml as an example.

Data pattern: row Id="5" PostId="5" Score="9" Text="this is a super theoretical AI question. An interesting discussion! but out of place..." CreationDate="2014-05-14T00:23:15.437" UserId="34"

Let's say I want to retrieve all comments made by a particular user ID or on a particular date. How do I do that?

Specifically:

  1. How do I prepare this .xml data as a data structure suitable for Redis?

  2. How can I upload it into Redis? I am using Redis on Windows, where the pipe and cat commands do not seem to work. I have tried CentOS, but I would prefer to use Redis on Windows.

Answer 1:

Before you choose a data structure, you need to understand what types of queries you will run. For example, if you have user-specific data and need to group activities per user and produce aggregated results, you will need different structures: build indexes, split the data into chunks, and so on.
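As a rough illustration for the two queries in the question (comments by user, comments by date), here is a minimal Python sketch. It assumes the dump stores each comment as a `<row ... />` element, that `CreationDate` is in UTC, and it uses a hypothetical key layout (`comment:<id>`, `user:<id>:comments`, `comments:by_date`) that is not part of the original post. Each row becomes one hash plus two sorted-set index entries, encoded in the Redis protocol format that mass insertion expects.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone
from io import BytesIO

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)

def iter_rows(source):
    """Stream <row> attribute dicts without loading the whole file."""
    context = ET.iterparse(source, events=("start", "end"))
    _, root = next(context)  # first event is the start of the root element
    for event, elem in context:
        if event == "end" and elem.tag == "row":
            yield dict(elem.attrib)
            root.clear()  # drop processed rows to keep memory flat

def row_to_commands(attrs):
    """Map one row to Redis commands (hypothetical key layout)."""
    comment_id = attrs["Id"]
    # Assumption: CreationDate is UTC; score is epoch milliseconds.
    created = datetime.fromisoformat(attrs["CreationDate"]).replace(tzinfo=timezone.utc)
    score = str((created - EPOCH) // timedelta(milliseconds=1))
    fields = []
    for name, value in attrs.items():
        if name != "Id":
            fields += [name, value]
    # Variadic HSET needs Redis 4.0+; older servers need HMSET instead.
    commands = [
        ["HSET", f"comment:{comment_id}", *fields],
        ["ZADD", "comments:by_date", score, comment_id],
    ]
    if "UserId" in attrs:  # rows without a UserId get no per-user index entry
        commands.append(["ZADD", f"user:{attrs['UserId']}:comments", score, comment_id])
    return commands

def encode_resp(args):
    """Encode one command as a Redis protocol (RESP) array of bulk strings."""
    out = [b"*%d\r\n" % len(args)]
    for arg in args:
        data = arg.encode("utf-8")
        out.append(b"$%d\r\n%s\r\n" % (len(data), data))
    return b"".join(out)

# Tiny made-up sample in the shape of the question's data pattern.
SAMPLE = (b'<comments><row Id="5" PostId="5" Score="9" '
          b'Text="An interesting discussion!" '
          b'CreationDate="2014-05-14T00:23:15.437" UserId="34" /></comments>')

payload = b"".join(
    encode_resp(cmd)
    for row in iter_rows(BytesIO(SAMPLE))
    for cmd in row_to_commands(row)
)
```

With this layout, a user's comments could be read with ZRANGE on `user:34:comments`, and a date window with ZRANGEBYSCORE on `comments:by_date` using epoch-millisecond bounds. If the payload is written to a file for `redis-cli --pipe`, open the file in binary mode so the `\r\n` terminators are not altered.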

For a relatively large amount of aggregated data (45 GB), I found sorted sets with ZRANGE usable, because ZRANGE has better complexity than LRANGE (O(log(N)+M) versus O(S+N), where S is the offset from the list's head or tail). You can split your data into chunks based on its size, process each ZRANGE call individually in threads, and then combine the results.
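The chunk-and-combine idea can be sketched roughly as follows. This is only an illustration: `fetch_in_chunks` and `chunk_ranges` are hypothetical helper names, and the Redis call is abstracted as a `fetch(start, stop)` callable so the sketch runs without a server. With the redis-py client, `fetch` could plausibly be `lambda s, e: r.zrange(key, s, e)` and `total` could come from `r.zcard(key)`.

```python
from concurrent.futures import ThreadPoolExecutor

def chunk_ranges(total, chunk_size):
    """Yield inclusive (start, stop) index pairs, the form ZRANGE expects."""
    for start in range(0, total, chunk_size):
        yield start, min(start + chunk_size, total) - 1

def fetch_in_chunks(fetch, total, chunk_size=10_000, workers=4):
    """Call fetch(start, stop) for each chunk in threads and merge in order."""
    ranges = list(chunk_ranges(total, chunk_size))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns results in the same order as the input ranges.
        parts = pool.map(lambda r: fetch(*r), ranges)
        return [member for part in parts for member in part]

# Stand-in for a sorted set: an in-memory list sliced like ZRANGE (stop inclusive).
members = [f"comment:{i}" for i in range(25)]
fake_zrange = lambda start, stop: members[start:stop + 1]

result = fetch_in_chunks(fake_zrange, len(members), chunk_size=10)
```

The chunk size and worker count here are arbitrary; in practice they would be tuned to the size of the data, as the answer suggests.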

On top of that structure, you can add indexes built with lists in places where you only need to iterate over relatively small amounts of data.



Tags: redis bigdata