Hierarchical data in Linq - options and performance

Posted 2019-01-04 10:28

I have some hierarchical data: each entry has an id and a (nullable) parent entry id. I want to retrieve all entries in the tree under a given entry. This is in a SQL Server 2005 database. I am querying it with LINQ to SQL in C# on .NET 3.5.

LINQ to SQL does not support Common Table Expressions directly. My choices are to assemble the data in code with several LINQ queries, or to make a view on the database that surfaces a CTE.
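
For reference, the "several LINQ queries" option could be sketched roughly as below: one query per tree level, collecting children of the current level until no more are found. The `Entry` class and `TreeQueries.GetDescendants` are hypothetical names used only for illustration. The sketch assumes that a `Contains` call on a local list is translated by LINQ to SQL into an `IN (...)` clause; very wide levels may need batching because SQL Server limits the number of parameters per query.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity shape: an id plus a nullable parent id.
public class Entry
{
    public int Id { get; set; }
    public int? ParentId { get; set; }
}

public static class TreeQueries
{
    // Returns every entry below rootId, issuing one query per tree level.
    public static List<Entry> GetDescendants(IQueryable<Entry> entries, int rootId)
    {
        var result = new List<Entry>();
        var frontier = new List<int> { rootId };
        var seen = new HashSet<int> { rootId }; // guards against cycles in bad data

        while (frontier.Count > 0)
        {
            List<int> parents = frontier; // local list captured by the query

            // One round trip: all rows whose parent is in the current level.
            List<Entry> children = entries
                .Where(e => e.ParentId.HasValue && parents.Contains(e.ParentId.Value))
                .ToList();

            frontier = new List<int>();
            foreach (Entry child in children)
            {
                if (seen.Add(child.Id))
                {
                    result.Add(child);
                    frontier.Add(child.Id);
                }
            }
        }
        return result;
    }
}
```

With LINQ to SQL, `entries` would be the table exposed by the DataContext; to try the sketch in memory, `someList.AsQueryable()` is enough. The number of round trips equals the depth of the subtree, which is the main cost to weigh against the CTE approach.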

Which option (or another option) do you think will perform better when data volumes get large? Is SQL Server 2008's HierarchyId type supported in LINQ to SQL?

9 answers
ら.Afraid
#2 · 2019-01-04 10:57

In SQL Server 2008 you could use HierarchyID directly; in SQL Server 2005 you may have to implement an equivalent manually. A plain ParentID column does not perform well on large data sets. Also check this article for more discussion on the topic.

放我归山
#3 · 2019-01-04 11:00

I would set up a view and an associated table-valued function based on the CTE. My reasoning is that, while you could implement the logic on the application side, doing so would mean sending the intermediate data over the wire for computation in the application. Using the DBML designer, the view translates into a Table entity. You can then associate the function with the Table entity and invoke the method created on the DataContext to obtain objects of the type defined by the view. Using the table-valued function lets the query engine take your parameters into account while constructing the result set, rather than applying a condition after the fact to the result set defined by the view.

CREATE TABLE [dbo].[hierarchical_table](
    [id] [int] IDENTITY(1,1) NOT NULL,
    [parent_id] [int] NULL,
    [data] [varchar](255) NOT NULL,
 CONSTRAINT [PK_hierarchical_table] PRIMARY KEY CLUSTERED 
(
    [id] ASC
)WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON) ON [PRIMARY]
) ON [PRIMARY]
GO

CREATE VIEW [dbo].[vw_recursive_view]
AS
WITH hierarchy_cte(id, parent_id, data, lvl) AS
(SELECT     id, parent_id, data, 0 AS lvl
      FROM         dbo.hierarchical_table
      WHERE     (parent_id IS NULL)
      UNION ALL
      SELECT     t1.id, t1.parent_id, t1.data, h.lvl + 1 AS lvl
      FROM         dbo.hierarchical_table AS t1 INNER JOIN
                            hierarchy_cte AS h ON t1.parent_id = h.id)
SELECT     id, parent_id, data, lvl
FROM         hierarchy_cte AS result
GO

CREATE FUNCTION [dbo].[fn_tree_for_parent] 
(
    @parent int
)
RETURNS 
@result TABLE 
(
    id int not null,
    parent_id int,
    data varchar(255) not null,
    lvl int not null
)
AS
BEGIN
    WITH hierarchy_cte(id, parent_id, data, lvl) AS
   (SELECT     id, parent_id, data, 0 AS lvl
        FROM         dbo.hierarchical_table
        WHERE     (id = @parent OR (parent_id IS NULL AND @parent IS NULL))
        UNION ALL
        SELECT     t1.id, t1.parent_id, t1.data, h.lvl + 1 AS lvl
        FROM         dbo.hierarchical_table AS t1 INNER JOIN
            hierarchy_cte AS h ON t1.parent_id = h.id)
    INSERT INTO @result
    SELECT     id, parent_id, data, lvl
    FROM         hierarchy_cte AS result
    RETURN
END
GO

ALTER TABLE [dbo].[hierarchical_table]  WITH CHECK ADD  CONSTRAINT [FK_hierarchical_table_hierarchical_table] FOREIGN KEY([parent_id])
REFERENCES [dbo].[hierarchical_table] ([id])

ALTER TABLE [dbo].[hierarchical_table] CHECK CONSTRAINT [FK_hierarchical_table_hierarchical_table]

To use it, you would do something like the following, assuming some reasonable naming scheme:

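// Declare the variable as the generated context type so that its
// mapped tables and functions are visible.
using (HierarchicalDataContext dc = new HierarchicalDataContext())
{
    // Pick a starting node; here, simply the first row of the table.
    HierarchicalTableEntity h = (from e in dc.HierarchicalTableEntities
                                 select e).First();
    // Call the table-valued function mapped onto the DataContext.
    var query = dc.FnTreeForParent(h.ID);
    foreach (HierarchicalTableViewEntity entity in query)
    {
        // ...process the tree node...
    }
}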
Rolldiameter
#4 · 2019-01-04 11:00

I have done this two ways:

  1. Drive the retrieval of each layer of the tree based on user input. Imagine a tree view control populated with the root node, the children of the root, and the grandchildren of the root. Only the root and its children are expanded (the grandchildren are hidden by the collapsed nodes). When the user expands a child node, the grandchildren of the root (previously retrieved and hidden) are displayed, and a retrieval of all of the great-grandchildren is launched. Repeat the pattern for N layers deep. This pattern works very well for large trees (deep or wide) because it only retrieves the portion of the tree that is needed.
  2. Use a stored procedure with LINQ. Use something like a common table expression on the server to build your results in a flat table, or build an XML tree in T-SQL. Scott Guthrie has a great article about using stored procs in LINQ. If the results come back in a flat format, build your tree from them; if you return an XML tree, use it directly.
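
For the flat-format case in option 2, building the tree on the client could look roughly like this. It is only a sketch: `FlatRow`, `TreeNode` and `TreeBuilder.Build` are hypothetical names, and the row shape (id, parent id, data) is an assumption about what the stored procedure returns.

```csharp
using System.Collections.Generic;
using System.Linq;

// Hypothetical shape of one flat result row (id, parent_id, data).
public class FlatRow
{
    public int Id { get; set; }
    public int? ParentId { get; set; }
    public string Data { get; set; }
}

public class TreeNode
{
    public FlatRow Row { get; private set; }
    public List<TreeNode> Children { get; private set; }

    public TreeNode(FlatRow row)
    {
        Row = row;
        Children = new List<TreeNode>();
    }
}

public static class TreeBuilder
{
    // Links rows into a tree in a single pass over a dictionary keyed by id.
    // Returns the nodes whose parent is null or not present in the input.
    public static List<TreeNode> Build(IEnumerable<FlatRow> rows)
    {
        Dictionary<int, TreeNode> byId = rows.ToDictionary(r => r.Id, r => new TreeNode(r));
        var roots = new List<TreeNode>();

        foreach (TreeNode node in byId.Values)
        {
            TreeNode parent;
            if (node.Row.ParentId.HasValue &&
                byId.TryGetValue(node.Row.ParentId.Value, out parent))
            {
                parent.Children.Add(node);
            }
            else
            {
                roots.Add(node);
            }
        }
        return roots;
    }
}
```

Because every row is looked up by id, the rows do not need to arrive in any particular order, and a subtree result (where the top node's parent is not in the result set) simply yields that top node as the single root.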
爷的心禁止访问
#5 · 2019-01-04 11:01

This option might also prove useful:

LINQ AsHierarchy() extension method
http://www.scip.be/index.php?Page=ArticlesNET18

我想做一个坏孩纸
#6 · 2019-01-04 11:03

I am surprised nobody has mentioned an alternative database design. When the hierarchy needs to be flattened across multiple levels and retrieved with high performance (at the cost of extra storage space), it is better to use a separate entity-to-entity table to track the hierarchy instead of the parent_id approach.

It allows not only single-parent relations but also multi-parent relations, level indications, and different types of relationships:

CREATE TABLE Person (
  Id INTEGER,
  Name TEXT
);

CREATE TABLE PersonInPerson (
  PersonId INTEGER NOT NULL,
  InPersonId INTEGER NOT NULL,
  Level INTEGER,
  RelationKind VARCHAR(1)
);
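
To illustrate how such a table might be used, here is a rough in-memory sketch. It assumes that a `PersonInPerson` row means "PersonId sits somewhere below InPersonId, Level steps away"; that reading, the `ClosureTable` helper and its method names are my assumptions, and `RelationKind` is left out for brevity.

```csharp
using System.Collections.Generic;
using System.Linq;

public class PersonInPerson
{
    public int PersonId { get; set; }   // the descendant (assumed meaning)
    public int InPersonId { get; set; } // the ancestor (assumed meaning)
    public int Level { get; set; }      // distance between the two
}

public static class ClosureTable
{
    // Rows to insert when newId is placed directly under parentId:
    // one row per existing ancestor of the parent, plus one for the parent itself.
    public static List<PersonInPerson> RowsForNewChild(
        IEnumerable<PersonInPerson> links, int newId, int parentId)
    {
        List<PersonInPerson> added = links
            .Where(l => l.PersonId == parentId)
            .Select(l => new PersonInPerson
            {
                PersonId = newId,
                InPersonId = l.InPersonId,
                Level = l.Level + 1
            })
            .ToList();

        added.Add(new PersonInPerson { PersonId = newId, InPersonId = parentId, Level = 1 });
        return added;
    }

    // The whole subtree comes back from a single, non-recursive filter.
    public static List<int> DescendantIds(IEnumerable<PersonInPerson> links, int ancestorId)
    {
        return links
            .Where(l => l.InPersonId == ancestorId)
            .Select(l => l.PersonId)
            .ToList();
    }
}
```

The trade-off shows up directly: reads become a plain filter that translates to a simple WHERE clause, while every insert has to write one row per ancestor, which is where the extra storage goes.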