I implemented a data virtualization solution using some ideas from CodePlex, the blog of Bea Stollnitz, and Vincent Van Den Berghe's paper (same link). However, I needed a different approach, so I decided to write my own solution.
I am using a DataGrid to display about a million rows with this solution. I am using UI virtualization as well. My solution works, but in certain situations I see some odd behavior in how the DataGrid requests data from its source.
About the solution
I ended up writing a list which does all the heavy work. It is a generic class named VirtualList<T>. It implements the ICollectionViewFactory interface, so the collection view creation mechanism can create a VirtualListCollectionView<T> instance to wrap it. This class inherits from ListCollectionView. I did not follow the suggestions to write my own ICollectionView implementation. Inheriting seems to work fine as well.
The VirtualList<T> splits the whole data set into pages. It gets the total item count, and every time the DataGrid requests a row via the list indexer, it loads the appropriate page or returns it from the cache. The pages are recycled internally, and a DispatcherTimer disposes of unused pages during idle time.
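The paging idea described above can be sketched roughly as follows. This is a minimal illustration, not my actual VirtualList<T>: the class name PagedCache<T>, the fetchPage delegate, the PageLoads counter and the Trim method are hypothetical names introduced only for this example, and the timer-driven cleanup is reduced to a method that a timer could call.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical, simplified stand-in for the paging part of VirtualList<T>.
public class PagedCache<T>
{
    private readonly int _pageSize;
    // Assumed data source callback: (startIndex, count) -> rows of that page.
    private readonly Func<int, int, IList<T>> _fetchPage;
    private readonly Dictionary<int, IList<T>> _pages = new Dictionary<int, IList<T>>();

    public PagedCache(int count, int pageSize, Func<int, int, IList<T>> fetchPage)
    {
        if (count < 0) throw new ArgumentOutOfRangeException(nameof(count));
        if (pageSize <= 0) throw new ArgumentOutOfRangeException(nameof(pageSize));
        Count = count;
        _pageSize = pageSize;
        _fetchPage = fetchPage ?? throw new ArgumentNullException(nameof(fetchPage));
    }

    // Total number of rows, known up front.
    public int Count { get; }

    // Number of times a page had to be fetched (only here to observe caching).
    public int PageLoads { get; private set; }

    // The indexer loads the page that contains the row, or serves it from the cache.
    public T this[int index]
    {
        get
        {
            if (index < 0 || index >= Count)
                throw new ArgumentOutOfRangeException(nameof(index));

            int pageNumber = index / _pageSize;
            if (!_pages.TryGetValue(pageNumber, out IList<T> page))
            {
                int start = pageNumber * _pageSize;
                int length = Math.Min(_pageSize, Count - start); // last page may be short
                page = _fetchPage(start, length);
                _pages[pageNumber] = page;
                PageLoads++;
            }
            return page[index - pageNumber * _pageSize];
        }
    }

    // Drops every cached page except the listed ones; a timer could call this when idle.
    public void Trim(ICollection<int> pagesToKeep)
    {
        foreach (int pageNumber in new List<int>(_pages.Keys))
        {
            if (!pagesToKeep.Contains(pageNumber))
                _pages.Remove(pageNumber);
        }
    }
}
```

With a page size of 100, requesting rows 250 and 299 triggers a single fetch of page 2; the second request is served from the cache.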
Data request patterns
The first thing I learned is that VirtualList<T> should implement IList (non-generic). Otherwise the ItemsControl will treat it as an IEnumerable and query/enumerate all the rows. This is logical, since the DataGrid is not type safe, so it cannot use the IList<T> interface.
The row with index 0 is frequently requested by the DataGrid. It seems to be used for visual item measurement (according to the call stack), so I simply cache this one.
The caching mechanism inside the DataGrid uses a predictable pattern to query the rows it shows. First it asks for the visible rows from top to bottom (twice for every row). Then it queries a few rows (how many depends on the size of the visible area) before the visible area, including the first visible row, in descending order, that is, from bottom to top. After that it requests the same number of rows after the visible rows, including the last visible row, from top to bottom. If the visible row indexes are 4, 5, 6, the data requests would be: 4,4,5,5,6,6,4,3,2,1,6,7,8,9. If my page size is properly set, I can serve all these requests from the current and previously loaded page.
If CanSelectMultipleItems is True and the user selects multiple items using the SHIFT key or a mouse drag, the DataGrid enumerates all the rows from the beginning of the list to the end of the selection. This enumeration happens via the IEnumerable interface regardless of whether IList is implemented.
If the selected row is not visible and the current visible area is "far" from the selected row, the DataGrid sometimes starts requesting all the items from the selected row to the end of the visible area, including all the rows in between, which are not even visible. I could not figure out the exact pattern of this behavior. Maybe my implementation is the reason for that.
My questions
I am wondering why the DataGrid requests non-visible rows, since those rows will be requested again when they become visible.
Why is it necessary to request every row two or three times?
Can anyone tell me how to make the DataGrid not use IEnumerable, other than by turning off multiple item selection?