Recursive query used for transitive closure
You can simplify in several places (assuming acct_id
and parent_id
are NOT NULL
):
WITH RECURSIVE search_graph AS (
SELECT parent_id, ARRAY[acct_id] AS path
FROM account
UNION ALL
SELECT g.parent_id, sg.path || g.acct_id
FROM search_graph sg
JOIN account g ON g.acct_id = sg.parent_id
WHERE g.acct_id <> ALL(sg.path)
)
SELECT path[1] AS child
, path[array_upper(path,1)] AS parent
, path
FROM search_graph
ORDER BY path;
- The columns
acct_id
,depth
,cycle
are just noise in your query. - The
WHERE
condition has to exit the recursion one step earlier, before the duplicate entry from the top node is in the result. That was an "off-by-one" in your original.
The rest is formatting.
If you know the only possible circle in your graph is a self-reference, we can have that cheaper:
WITH RECURSIVE search_graph AS (
SELECT parent_id, ARRAY[acct_id] AS path, acct_id <> parent_id AS keep_going
FROM account
UNION ALL
SELECT g.parent_id, sg.path || g.acct_id, g.acct_id <> g.parent_id
FROM search_graph sg
JOIN account g ON g.acct_id = sg.parent_id
WHERE sg.keep_going
)
SELECT path[1] AS child
, path[array_upper(path,1)] AS parent
, path
FROM search_graph
ORDER BY path;
SQL Fiddle.
Note there would be problems (at least up to pg v9.4) for data types with a modifier (like varchar(5)
) because array concatenation loses the modifier but the rCTE insists on types matching exactly:
- Surprising results for data types with type modifier
Recursive query challenge - simple parent/child example
With help from RhodiumToad on #postgresql, I've arrived at this solution:
WITH RECURSIVE node_graph AS (
SELECT ancestor_node_id as path_start, descendant_node_id as path_end,
array[ancestor_node_id, descendant_node_id] as path
FROM node_relations
UNION ALL
SELECT ng.path_start, nr.descendant_node_id as path_end,
ng.path || nr.descendant_node_id as path
FROM node_graph ng
JOIN node_relations nr ON ng.path_end = nr.ancestor_node_id
)
SELECT * from node_graph order by path_start, array_length(path,1);
The result is exactly as expected.
Transitive-Closure Table Restructure
Yes, if you want to store the transitive closure, you need to store all paths.
For some operations, it's even helpful to store the path of length 0:
| component | component_of | depth |
|-----------|--------------|-------|
| D | D | 0 |
| D | A | 1 |
| D | B | 2 |
| C | C | 0 |
| B | B | 0 |
| B | A | 1 |
| A | A | 0 |
In MySQL 8.0, none of this will be needed. We'll finally be able to use recursive queries.
How to CONSTRUCT a property's transitive closure in SPARQL?
Problems in your code
So I tried to make subquery like this: …
I'm not sure what subquery syntax you're trying to use, but it's not legal. Subqueries are just wrapped in braces. E.g.,
select ?s where {
{ select ?x where { ... } }
}
The query you've written isn't legal.
But I obtained an error.
If you don't show us the error, we certainly can't help in fixing it. I expect that you got a syntax error. You can validate your query using sparql.org's query validator. That probably won't help too much in this case, though, because you can only use select queries in subqueries. (It would be very helpful, though, if construct queries were supported for subqueries.)
Transitive closures for construct queries
In general, you can't do arbitrary recursion in SPARQL. However, in the specific case that you've got, you can use property paths in the pattern to construct the transitive closure of a pattern. E.g.,
construct { ?a :partOf ?b }
where { ?a :partOf+ ?b }
This says that if there's a path of non-zero length from ?a to ?c using just :partOf links, then include a triple ?a :partOf ?b in the constructed output.
hierarchical data in a database: recursive query vs. closure tables vs. graph database
The whole closure table is redundant if you can use recursive queries :)
I think it's much better to have a complicated recursive query that you have to figure out once than deal with the extra IO (and disk space) of a separate table and associated triggers.
I have done some simple tests with recursive queries in postgres. With a few million rows in the table queries were still < 10ms for returning all parents of a particular child. Returning all children was fast too, depending on the level of the parent. It seemed to depend more on disk IO fetching the rows rather than the query speed itself. This was done single user, so not sure how it would perform under load. I suspect it would be very fast still if you can also hold most of the table in memory (and setup postgres correctly). Clustering the table by parent id also seemed to help.
detect cycles in a graph in SQL using recursive common table expression
CREATE TABLE #myEdge (
ID INT IDENTITY(1,1) PRIMARY KEY,
NodeIDFrom INT,
NodeIDTo INT
)
INSERT INTO #myEdge(NodeIDFrom, NodeIDTo) VALUES (4, 5),(5, 6),(6, 4);
DECLARE @rootNode AS integer = 4;
--compute the transitive closure from this root. column Niveau holds the recursion nesting level.
WITH cte_transitive_closure(rootNode, NodeFrom, NodeTo, Niveau)
AS (
SELECT @rootNode, NULL, @rootNode, 0
UNION ALL
SELECT rootNode, e.NodeIDFrom, e.NodeIDTo, Niveau+1
FROM cte_transitive_closure AS cte
JOIN #myEdge AS e ON cte.NodeTo=e.NodeIdFrom
WHERE Niveau < 99
)
SELECT *
INTO #transitive_closure
FROM cte_transitive_closure;
SELECT * FROM #transitive_closure;
--starting from root as a reached target, move back in the trace until the root is hit again.
WITH cte_cycle(NodeFrom, NodeTo, Cycle)
AS (
SELECT @rootNode,NULL,0
UNION ALL
SELECT t.NodeFrom, t.NodeTo, 0
FROM cte_cycle AS cte
JOIN #transitive_closure AS t ON cte.NodeFrom = t.NodeTo
WHERE t.NodeFrom != @rootNode AND Cycle=0
UNION ALL
SELECT t.NodeFrom, t.NodeTo, 1
FROM cte_cycle AS cte
JOIN #transitive_closure AS t ON cte.NodeFrom = t.NodeTo
WHERE t.NodeFrom = @rootNode AND Cycle=0
)
SELECT DISTINCT * FROM cte_cycle;
result set:
NodeFrom -> NodeTo (Cycle)
4 -> NULL (0)
4 -> 5 (1)
5 -> 6 (0)
6 -> 4 (0)
Related Topics
I Don't Understand Collation? (Mysql, Rdbms, Character Sets)
How to Insert Null Values into SQL Server
Remove Duplicates in a Django Query
Sqlite Format Number with 2 Decimal Places Always
Update Values in Identity Column
How to Find All Rows with a Null Value in Any Column Using Postgresql
Subtract One Day from Datetime
Why Doesn't SQL Server Support Unsigned Datatype
Is There a Coalesce-Like Function in Excel
SQL Join on Nearest Less Than Date
A How to Escape %% When Building Like Queries in Rails 3/Activerecord
Why Even Use *Db.Exec() or Prepared Statements in Golang
How to Select Bottom Most Rows
Sql: Combine Select Count(*) from Multiple Tables
How to Compare Data Between Two Table in Different Databases Using SQL Server 2008