Releases: neo4j/graph-data-science
GDS 1.6.0
Release Date: 27 May 2021
GDS 1.6 is compatible with Neo4j 4.0, 4.1, and 4.2 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.
Breaking changes
- Degree centrality has been promoted to the product tier
- Added procedures:
gds.degree.stream.estimate
gds.degree.write.estimate
gds.degree.mutate
gds.degree.mutate.estimate
gds.degree.stats
gds.degree.stats.estimate
- Removed alpha procedures:
gds.alpha.degree.stream
gds.alpha.degree.write
- Article Rank has been promoted to the product tier
- Added procedures:
gds.articleRank.stream
gds.articleRank.stream.estimate
gds.articleRank.write
gds.articleRank.write.estimate
gds.articleRank.mutate
gds.articleRank.mutate.estimate
gds.articleRank.stats
gds.articleRank.stats.estimate
- Removed alpha procedures:
gds.alpha.articleRank.stream
gds.alpha.articleRank.write
- Eigenvector Centrality has been promoted to the product tier
- Added procedures:
gds.eigenvector.stream
gds.eigenvector.stream.estimate
gds.eigenvector.write
gds.eigenvector.write.estimate
gds.eigenvector.mutate
gds.eigenvector.mutate.estimate
gds.eigenvector.stats
gds.eigenvector.stats.estimate
- Removed alpha procedures:
gds.alpha.eigenvector.stream
gds.alpha.eigenvector.write
- AStar has been promoted to the product tier
- Added procedures:
gds.shortestPath.astar.stream
gds.shortestPath.astar.stream.estimate
gds.shortestPath.astar.write
gds.shortestPath.astar.write.estimate
gds.shortestPath.astar.mutate
gds.shortestPath.astar.mutate.estimate
- Removed beta procedures:
gds.beta.shortestPath.astar.stream
gds.beta.shortestPath.astar.stream.estimate
gds.beta.shortestPath.astar.write
gds.beta.shortestPath.astar.write.estimate
gds.beta.shortestPath.astar.mutate
gds.beta.shortestPath.astar.mutate.estimate
- The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
- Yens K Shortest Paths has been promoted to the product tier:
- Added procedures:
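gds.shortestPath.yens.stream
gds.shortestPath.yens.stream.estimate
gds.shortestPath.yens.write
gds.shortestPath.yens.write.estimate
gds.shortestPath.yens.mutate
gds.shortestPath.yens.mutate.estimate
- Removed beta procedures:
gds.beta.shortestPath.yens.stream
gds.beta.shortestPath.yens.stream.estimate
gds.beta.shortestPath.yens.write
gds.beta.shortestPath.yens.write.estimate
gds.beta.shortestPath.yens.mutate
gds.beta.shortestPath.yens.mutate.estimate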
- The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
- Dijkstra Source-Target has been promoted to the product tier:
- Added procedures:
gds.shortestPath.dijkstra.stream
gds.shortestPath.dijkstra.stream.estimate
gds.shortestPath.dijkstra.write
gds.shortestPath.dijkstra.write.estimate
gds.shortestPath.dijkstra.mutate
gds.shortestPath.dijkstra.mutate.estimate
- Removed beta procedures:
gds.beta.shortestPath.dijkstra.stream
gds.beta.shortestPath.dijkstra.stream.estimate
gds.beta.shortestPath.dijkstra.write
gds.beta.shortestPath.dijkstra.write.estimate
gds.beta.shortestPath.dijkstra.mutate
gds.beta.shortestPath.dijkstra.mutate.estimate
- The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
- Dijkstra Single-Source has been promoted to the product tier:
- Added procedures:
gds.allShortestPaths.dijkstra.stream
gds.allShortestPaths.dijkstra.stream.estimate
gds.allShortestPaths.dijkstra.write
gds.allShortestPaths.dijkstra.write.estimate
gds.allShortestPaths.dijkstra.mutate
gds.allShortestPaths.dijkstra.mutate.estimate
- Removed beta procedures:
gds.beta.allShortestPaths.dijkstra.stream
gds.beta.allShortestPaths.dijkstra.stream.estimate
gds.beta.allShortestPaths.dijkstra.write
gds.beta.allShortestPaths.dijkstra.write.estimate
gds.beta.allShortestPaths.dijkstra.mutate
gds.beta.allShortestPaths.dijkstra.mutate.estimate
- The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
- Node2Vec has been promoted to the beta tier
- Added procedures:
gds.beta.node2vec.stream
gds.beta.node2vec.stream.estimate
gds.beta.node2vec.write
gds.beta.node2vec.write.estimate
gds.beta.node2vec.mutate
gds.beta.node2vec.mutate.estimate
- Removed alpha procedures:
gds.alpha.node2vec.stream
gds.alpha.node2vec.write
- The parameter centerSamplingFactor is renamed to positiveSamplingFactor
- The parameter contextSamplingExponent is renamed to negativeSamplingExponent
- maxStreakCount configuration parameter is renamed to patience. It is used in the train modes of Node Classification and Link Prediction.
- maxIterations and minIterations configuration parameters are renamed to maxEpochs and minEpochs. They are used in the train modes of Node Classification and Link Prediction.
- windowSize configuration parameter is removed from the train modes of Node Classification and Link Prediction.
- gds.alpha.ml.linkPrediction.train configuration parameter classRatio is renamed to negativeClassWeight. It is also mandatory now.
- Removed degreeAsProperty configuration parameter from GraphSAGE
- The same effect can be achieved by using gds.degree.mutate and using the mutated property as a feature for GraphSAGE training.
- Important: GraphSAGE models persisted with earlier versions of GDS are not compatible with this version.
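The tier promotions above mostly amount to dropping or changing a prefix in the procedure name. A minimal sketch of a rewrite helper for existing Cypher query strings follows. The helper name, the prefix table, and the graph name 'myGraph' are illustrative assumptions and not part of GDS. The table covers only a subset of the promotions, and the helper does not handle the removed path parameter or any renamed configuration keys.

```python
import re

# Old procedure prefix -> new prefix, taken from the tier promotions listed above.
# Only a subset is shown; extend the table as needed.
PROMOTED_PREFIXES = {
    "gds.alpha.degree": "gds.degree",
    "gds.alpha.articleRank": "gds.articleRank",
    "gds.alpha.eigenvector": "gds.eigenvector",
    "gds.alpha.node2vec": "gds.beta.node2vec",
    "gds.beta.shortestPath.dijkstra": "gds.shortestPath.dijkstra",
}


def migrate_procedure_names(query: str) -> str:
    """Rewrite promoted procedure names in a Cypher query string.

    Hypothetical helper, not part of GDS. It only renames procedures and
    leaves configuration maps and YIELD clauses untouched.
    """
    for old, new in PROMOTED_PREFIXES.items():
        # Match the old prefix as a whole name: not preceded by a word
        # character or dot, and followed by ".<mode>".
        pattern = re.compile(r"(?<![\w.])" + re.escape(old) + r"(?=\.\w)")
        query = pattern.sub(new, query)
    return query


if __name__ == "__main__":
    old_query = "CALL gds.alpha.articleRank.write('myGraph', {writeProperty: 'rank'})"
    print(migrate_procedure_names(old_query))
```

Queries that relied on the removed path parameter still need a manual change, since path computation is now selected through the YIELD sub-clause.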
New features
- New ScaleProperties procedures to transform and scale node properties. Available scalers: Min-max, Max, Mean, Log, Standard Score, L1 Norm, L2 Norm
gds.alpha.scaleProperties.stream
gds.alpha.scaleProperties.mutate
- Added ability to create new in-memory graphs by filtering existing named graphs based on node and relationship properties with new catalog procedure
gds.beta.graph.create.subgraph
- Two new centrality algorithms for influence maximization were contributed by community member @xkitsios
gds.alpha.influenceMaximization.celf.stream
gds.alpha.influenceMaximization.greedy.stream
- Link Prediction:
- Added support for storing, loading and publishing Link Prediction models.
- Added progress logging for gds.alpha.ml.linkPrediction.train and gds.alpha.ml.linkPrediction.predict.
- Added write and stream modes to gds.alpha.ml.linkPrediction.predict
gds.alpha.ml.linkPrediction.stream
gds.alpha.ml.linkPrediction.write
- Added estimate mode for Link Prediction:
gds.alpha.ml.linkPrediction.train.estimate
gds.alpha.ml.lin...
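For readers unfamiliar with the ScaleProperties scalers named under New features, the sketch below shows three of them (Min-max, Standard Score, L2 Norm) using their common textbook definitions. These definitions are an assumption: the release notes do not specify the formulas, and the GDS implementation may differ in details such as the variance estimator or the handling of constant properties.

```python
import math


def min_max(values):
    """Scale values linearly into [0, 1]; constant input maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standard_score(values):
    """Subtract the mean and divide by the population standard deviation."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(variance)
    if std == 0.0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]


def l2_norm(values):
    """Divide each value by the Euclidean norm of the whole property vector."""
    norm = math.sqrt(sum(v * v for v in values))
    if norm == 0.0:
        return [0.0 for _ in values]
    return [v / norm for v in values]


if __name__ == "__main__":
    ages = [1.0, 2.0, 3.0]  # hypothetical node property values
    print(min_max(ages))
    print(standard_score(ages))
    print(l2_norm([3.0, 4.0]))
```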
GDS 1.6 Preview
Release Date: 20 May 2021
GDS 1.6 is compatible with Neo4j 4.0, 4.1, and 4.2 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.
Breaking changes
- Degree centrality has been promoted to the product tier
- Added procedures:
gds.degree.stream.estimate
gds.degree.write.estimate
gds.degree.mutate
gds.degree.mutate.estimate
gds.degree.stats
gds.degree.stats.estimate
- Removed alpha procedures:
gds.alpha.degree.stream
gds.alpha.degree.write
- Article Rank has been promoted to the product tier
- Added procedures:
gds.articleRank.stream
gds.articleRank.stream.estimate
gds.articleRank.write
gds.articleRank.write.estimate
gds.articleRank.mutate
gds.articleRank.mutate.estimate
gds.articleRank.stats
gds.articleRank.stats.estimate
- Removed alpha procedures:
gds.alpha.articleRank.stream
gds.alpha.articleRank.write
- Eigenvector Centrality has been promoted to the product tier
- Added procedures:
gds.eigenvector.stream
gds.eigenvector.stream.estimate
gds.eigenvector.write
gds.eigenvector.write.estimate
gds.eigenvector.mutate
gds.eigenvector.mutate.estimate
gds.eigenvector.stats
gds.eigenvector.stats.estimate
- Removed alpha procedures:
gds.alpha.eigenvector.stream
gds.alpha.eigenvector.write
- AStar has been promoted to the product tier
- Added procedures:
gds.shortestPath.astar.stream
gds.shortestPath.astar.stream.estimate
gds.shortestPath.astar.write
gds.shortestPath.astar.write.estimate
gds.shortestPath.astar.mutate
gds.shortestPath.astar.mutate.estimate
- Removed beta procedures:
gds.beta.shortestPath.astar.stream
gds.beta.shortestPath.astar.stream.estimate
gds.beta.shortestPath.astar.write
gds.beta.shortestPath.astar.write.estimate
gds.beta.shortestPath.astar.mutate
gds.beta.shortestPath.astar.mutate.estimate
- The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
- Yens K Shortest Paths has been promoted to the product tier:
- Added procedures:
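gds.shortestPath.yens.stream
gds.shortestPath.yens.stream.estimate
gds.shortestPath.yens.write
gds.shortestPath.yens.write.estimate
gds.shortestPath.yens.mutate
gds.shortestPath.yens.mutate.estimate
- Removed beta procedures:
gds.beta.shortestPath.yens.stream
gds.beta.shortestPath.yens.stream.estimate
gds.beta.shortestPath.yens.write
gds.beta.shortestPath.yens.write.estimate
gds.beta.shortestPath.yens.mutate
gds.beta.shortestPath.yens.mutate.estimate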
- The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
- Dijkstra Source-Target has been promoted to the product tier:
- Added procedures:
gds.shortestPath.dijkstra.stream
gds.shortestPath.dijkstra.stream.estimate
gds.shortestPath.dijkstra.write
gds.shortestPath.dijkstra.write.estimate
gds.shortestPath.dijkstra.mutate
gds.shortestPath.dijkstra.mutate.estimate
- Removed beta procedures:
gds.beta.shortestPath.dijkstra.stream
gds.beta.shortestPath.dijkstra.stream.estimate
gds.beta.shortestPath.dijkstra.write
gds.beta.shortestPath.dijkstra.write.estimate
gds.beta.shortestPath.dijkstra.mutate
gds.beta.shortestPath.dijkstra.mutate.estimate
- The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
- Dijkstra Single-Source has been promoted to the product tier:
- Added procedures:
gds.allShortestPaths.dijkstra.stream
gds.allShortestPaths.dijkstra.stream.estimate
gds.allShortestPaths.dijkstra.write
gds.allShortestPaths.dijkstra.write.estimate
gds.allShortestPaths.dijkstra.mutate
gds.allShortestPaths.dijkstra.mutate.estimate
- Removed beta procedures:
gds.beta.allShortestPaths.dijkstra.stream
gds.beta.allShortestPaths.dijkstra.stream.estimate
gds.beta.allShortestPaths.dijkstra.write
gds.beta.allShortestPaths.dijkstra.write.estimate
gds.beta.allShortestPaths.dijkstra.mutate
gds.beta.allShortestPaths.dijkstra.mutate.estimate
- The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
- Node2Vec has been promoted to the beta tier
- Added procedures:
gds.beta.node2vec.stream
gds.beta.node2vec.stream.estimate
gds.beta.node2vec.write
gds.beta.node2vec.write.estimate
gds.beta.node2vec.mutate
gds.beta.node2vec.mutate.estimate
- Removed alpha procedures:
gds.alpha.node2vec.stream
gds.alpha.node2vec.write
- The parameter centerSamplingFactor is renamed to positiveSamplingFactor
- The parameter contextSamplingExponent is renamed to negativeSamplingExponent
- maxStreakCount configuration parameter is renamed to patience. It is used in the train modes of Node Classification and Link Prediction.
- maxIterations and minIterations configuration parameters are renamed to maxEpochs and minEpochs. They are used in the train modes of Node Classification and Link Prediction.
- windowSize configuration parameter is removed from the train modes of Node Classification and Link Prediction.
- gds.alpha.ml.linkPrediction.train configuration parameter classRatio is renamed to negativeClassWeight. It is also mandatory now.
- Removed degreeAsProperty configuration parameter from GraphSAGE
- The same effect can be achieved by using gds.degree.mutate and using the mutated property as a feature for GraphSAGE training.
- Important: GraphSAGE models persisted with earlier versions of GDS are not compatible with this version.
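The configuration renames and removals above can be applied mechanically to stored configuration maps. The following is a minimal sketch under the assumption that configurations are held as plain Python dictionaries before being passed to a driver. The function and table names are hypothetical and not part of GDS.

```python
# Renames and removals taken from the breaking changes listed above.
NODE2VEC_RENAMES = {
    "centerSamplingFactor": "positiveSamplingFactor",
    "contextSamplingExponent": "negativeSamplingExponent",
}

# Train modes of Node Classification and Link Prediction.
TRAIN_RENAMES = {
    "maxStreakCount": "patience",
    "maxIterations": "maxEpochs",
    "minIterations": "minEpochs",
}
TRAIN_REMOVED = frozenset({"windowSize"})

# gds.alpha.ml.linkPrediction.train additionally renames classRatio.
LINK_PREDICTION_TRAIN_RENAMES = {**TRAIN_RENAMES, "classRatio": "negativeClassWeight"}


def migrate_config(config, renames, removed=frozenset()):
    """Return a copy of config with renamed keys and removed keys dropped."""
    migrated = {}
    for key, value in config.items():
        if key in removed:
            continue
        migrated[renames.get(key, key)] = value
    return migrated


def migrate_link_prediction_train(config):
    """Migrate a Link Prediction train config and enforce the mandatory key."""
    migrated = migrate_config(config, LINK_PREDICTION_TRAIN_RENAMES, TRAIN_REMOVED)
    if "negativeClassWeight" not in migrated:
        raise ValueError("negativeClassWeight is mandatory in this version")
    return migrated


if __name__ == "__main__":
    old = {"maxIterations": 100, "windowSize": 5, "classRatio": 1.0}
    print(migrate_link_prediction_train(old))
```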
New features
- New ScaleProperties procedures to transform and scale node properties. Available scalers: Min-max, Max, Mean, Log, Standard Score, L1 Norm, L2 Norm
gds.alpha.scaleProperties.stream
gds.alpha.scaleProperties.mutate
- Added ability to create new in-memory graphs by filtering existing named graphs based on node and relationship properties with new catalog procedure
gds.beta.graph.create.subgraph
- Two new centrality algorithms for influence maximization were contributed by community member @xkitsios
gds.alpha.influenceMaximization.celf.stream
gds.alpha.influenceMaximization.greedy.stream
- Link Prediction:
- Added support for storing, loading and publishing Link Prediction models.
- Added progress logging for gds.alpha.ml.linkPrediction.train and gds.alpha.ml.linkPrediction.predict.
- Added write and stream modes to gds.alpha.ml.linkPrediction.predict
gds.alpha.ml.linkPrediction.stream
gds.alpha.ml.linkPrediction.write
- Added estimate mode for Link Prediction:
gds.alpha.ml.linkPrediction.train.estimate
gds.alpha.ml.lin...
1.5.2
Release Date: 11 May 2021
GDS 1.5 is compatible with Neo4j 4.0, 4.1, and 4.2 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.
Bug fixes
- Fixed a bug in the FastRPExtended implementation internals where the output could contain NaNs, especially when propertyDimension == embeddingDimension.
- Fixed a bug where Alpha similarity algorithms in some cases could fail on division by 0 when writing results back.
- Fixed an issue where gds.graph.drop could take a long time when the graph contained node embeddings.
- Fixed a bug where gds.beta.graphSage.train was failing in the presence of array properties.
1.5.1
Release Date: 3 March, 2021
GDS 1.5 is compatible with Neo4j 4.0 and 4.1, but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.
Bug fixes
- Fixed a bug which caused gds.graph.list and gds.graph.drop to throw an error when specifying a graph with duplicate property keys by failing early.
- Fixed potential ArrayIndexOutOfBoundsException when running gds.triangleCount on a relationship-filtered graph.
- Fixed a bug that can lead to inconsistencies when writing or mutating new relationships created from a label-filtered graph.
Improvements
- Progress logging: Removed a "disabled" log message from the database startup when GDS was running in its default configuration. It is replaced with a more elaborate "enabled" message when the progress tracking feature is enabled.
- The error message now includes the name of the current database if a graph is not found.
1.5.0
Release Date: 9 February, 2021
GDS 1.5 is compatible with Neo4j 4.0 and 4.1, but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.
Breaking changes
- Promote several shortest path algorithms to beta tier: Dijkstra, A*, and Yens k-shortest paths. The APIs have been standardized, and all include the ability to return source/target nodes, nodes traversed, and paths.
- This adds procedures
gds.beta.shortestPath.dijkstra.mutate
gds.beta.shortestPath.dijkstra.mutate.estimate
gds.beta.shortestPath.dijkstra.stream
gds.beta.shortestPath.dijkstra.stream.estimate
gds.beta.shortestPath.dijkstra.write
gds.beta.shortestPath.dijkstra.write.estimate
gds.beta.shortestPath.astar.mutate
gds.beta.shortestPath.astar.mutate.estimate
gds.beta.shortestPath.astar.stream
gds.beta.shortestPath.astar.stream.estimate
gds.beta.shortestPath.astar.write
gds.beta.shortestPath.astar.write.estimate
gds.beta.shortestPath.yens.mutate
gds.beta.shortestPath.yens.mutate.estimate
gds.beta.shortestPath.yens.stream
gds.beta.shortestPath.yens.stream.estimate
gds.beta.shortestPath.yens.write
gds.beta.shortestPath.yens.write.estimate
gds.beta.allShortestPaths.dijkstra.mutate
gds.beta.allShortestPaths.dijkstra.mutate.estimate
gds.beta.allShortestPaths.dijkstra.stream
gds.beta.allShortestPaths.dijkstra.stream.estimate
gds.beta.allShortestPaths.dijkstra.write
gds.beta.allShortestPaths.dijkstra.write.estimate
- And removes alpha procedures
gds.alpha.shortestPath.stream
gds.alpha.shortestPath.write
gds.alpha.shortestPath.astar.stream
gds.alpha.kShortestPaths.stream
gds.alpha.kShortestPaths.write
gds.alpha.shortestPaths.stream
gds.alpha.shortestPaths.write
- GDS will now throw an error when a user tries to use a mutate procedure on graphs not stored in the graph catalog (anonymous graphs)
New Features
- Introduced machine learning based multi-class node classification procedures:
- Add gds.alpha.ml.nodeClassification.train to train a model to predict a node label
- Add gds.alpha.ml.nodeClassification.predict.mutate to make predictions using a trained model
- Introduced machine learning based link prediction procedures:
- Add gds.alpha.linkPrediction.train procedure for training Link Prediction models.
- Added gds.alpha.linkPrediction.predict.mutate procedure for predicting relationships based on a trained Link Prediction model.
- Added support for list properties as features for
gds.alpha.nodeClassification
gds.beta.fastRPExtended
gds.beta.graphSage
- Added support for storing trained models on disk (Enterprise only)
gds.alpha.model.store
gds.alpha.model.load
gds.alpha.model.delete
- Added procedure for publishing trained models (Enterprise only)
gds.alpha.model.publish
- Added HITS algorithm to the alpha tier
gds.alpha.hits.mutate and gds.alpha.hits.mutate.estimate
gds.alpha.hits.stats and gds.alpha.hits.stats.estimate
gds.alpha.hits.stream and gds.alpha.hits.stream.estimate
gds.alpha.hits.write and gds.alpha.hits.write.estimate
- Added Speaker-Listener Label Propagation Algorithm (SLLPA) to the alpha tier
gds.alpha.sllpa.mutate and gds.alpha.sllpa.mutate.estimate
gds.alpha.sllpa.stats and gds.alpha.sllpa.stats.estimate
gds.alpha.sllpa.stream and gds.alpha.sllpa.stream.estimate
gds.alpha.sllpa.write and gds.alpha.sllpa.write.estimate
- Added CSV export capabilities with the gds.beta.graph.export.csv procedure to allow users to export their in-memory graph to CSV
- Add message reducer capability to Pregel framework to improve memory consumption and computation runtime.
- Added a progress logging procedure with gds.beta.listProgress, to return status of running algorithms. This is turned off by default, but can be enabled with gds.progress_tracking_enabled in the config.
- Add a new BitIdMap data structure to represent node id mappings (Enterprise only)
- The data structure can lead to a significant reduction in required heap space for an in-memory graph.
- The data structure is used for native graph projections and in some algorithms, e.g., Louvain.
- The data structure is not used in Cypher projections.
- The feature is enabled by default on GDS Enterprise Edition and can be disabled using the USE_BIT_ID_MAP feature toggle.
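To illustrate why a message reducer can lower memory consumption in a Pregel-style computation, the sketch below contrasts an inbox that stores every message with one that folds each incoming message into a single value per node. This is a conceptual illustration only. The class and method names are hypothetical and do not mirror the GDS Pregel Java API.

```python
class ListMessenger:
    """Stores every message sent to a node (no reducer)."""

    def __init__(self, node_count):
        self.inbox = [[] for _ in range(node_count)]

    def send(self, target, message):
        self.inbox[target].append(message)

    def receive(self, node):
        return self.inbox[node]


class ReducingMessenger:
    """Keeps one reduced value per node instead of a list of messages."""

    def __init__(self, node_count, identity, reduce):
        self.reduce = reduce
        self.inbox = [identity] * node_count

    def send(self, target, message):
        # Fold the new message into the value already held for the target.
        self.inbox[target] = self.reduce(self.inbox[target], message)

    def receive(self, node):
        return self.inbox[node]


if __name__ == "__main__":
    plain = ListMessenger(node_count=2)
    summed = ReducingMessenger(node_count=2, identity=0.0, reduce=lambda a, b: a + b)
    for message in (0.25, 0.5, 0.25):
        plain.send(0, message)
        summed.send(0, message)
    # An algorithm that only needs the sum gets the same answer either way,
    # but the reducing variant holds one float per node.
    print(sum(plain.receive(0)), summed.receive(0))
```

A reducer only applies when the algorithm needs an aggregate of its messages (a sum, minimum, or similar) rather than each individual message.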
Bug fixes
- Adding projection parameters as additional configuration in gds.graph.create and gds.graph.create.cypher will throw an exception if improperly configured, instead of being silently ignored.
- Fixed a bug in gds.alpha.articleRank where centrality scores were not normalized correctly
- Fixed a bug in path stream procedures where the path object (path: true) used incorrect node identifiers.
- Fixed a bug in path write procedures where the relationship property nodeIds contained incorrect node identifiers.
- Fixed a race condition that could cause exceptions thrown by scheduled tasks to be suppressed.
Improvements
- Improved progress logging to write progress per individual node label in gds.graph.writeNodeProperties.
- When a named graph does not exist, the graph catalog will display similarly named stored graphs.
- When a saved model does not exist, the model catalog will display similarly named stored models.
- Added centralityDistribution to the return fields for the write mode of the alpha centrality algorithms.
- gds.beta.graph.generate using relationshipDistribution: 'POWER_LAW' applies the distribution to the native orientation.
- Added centralityDistribution as a return field in gds.betweenness.[write/mutate/stats]
- Added getNeighbours and isMultiGraph to the Pregel-API.
- Added new message queue implementations for the Pregel framework, which
- replace the previously used JCTools queue and work with primitive double arrays instead of boxed values.
- lead to 3x to 5x faster runtimes for Pregel based algorithms.
- reduce GC pressure due to fewer object allocations, which leads to more predictable runtimes.
- support synchronous and asynchronous Pregel computations.
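The "similarly named" suggestions mentioned in the improvements above can be pictured with a short sketch. It uses Python's standard difflib module as a stand-in. How GDS actually measures name similarity is not stated in these notes, and the message wording and function name here are hypothetical.

```python
import difflib


def graph_not_found_message(requested, stored_names):
    """Build an error message that suggests similarly named catalog entries."""
    # get_close_matches ranks candidates by a similarity ratio in [0, 1]
    # and drops those below the cutoff.
    suggestions = difflib.get_close_matches(requested, stored_names, n=3, cutoff=0.6)
    message = f"Graph with name `{requested}` does not exist."
    if suggestions:
        message += " Did you mean one of: " + ", ".join(suggestions) + "?"
    return message


if __name__ == "__main__":
    print(graph_not_found_message("persons", ["person", "movies"]))
```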
Other Changes
- The PageRank configuration parameter cacheWeights has been deprecated. The parameter had no effect.
- Deprecated minimumScore, maximumScore, scoreSum return fields in gds.betweenness.[write/mutate/stats]
GDS 1.5 Preview
Release date: 29 January, 2021
Warning: This is a preview release and not intended for production use. If you have any feedback, please let us know: https://github.com/neo4j/graph-data-science/issues
GDS 1.5 is compatible with Neo4j 4.0 and 4.1, but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.
Breaking changes
- Promote several shortest path algorithms to beta tier: Dijkstra, A*, and Yens k-shortest paths. The APIs have been standardized, and all include the ability to return source/target nodes, nodes traversed, and paths.
- This adds procedures
gds.beta.shortestPath.dijkstra.mutate
gds.beta.shortestPath.dijkstra.mutate.estimate
gds.beta.shortestPath.dijkstra.stream
gds.beta.shortestPath.dijkstra.stream.estimate
gds.beta.shortestPath.dijkstra.write
gds.beta.shortestPath.dijkstra.write.estimate
gds.beta.shortestPath.astar.mutate
gds.beta.shortestPath.astar.mutate.estimate
gds.beta.shortestPath.astar.stream
gds.beta.shortestPath.astar.stream.estimate
gds.beta.shortestPath.astar.write
gds.beta.shortestPath.astar.write.estimate
gds.beta.shortestPath.yens.mutate
gds.beta.shortestPath.yens.mutate.estimate
gds.beta.shortestPath.yens.stream
gds.beta.shortestPath.yens.stream.estimate
gds.beta.shortestPath.yens.write
gds.beta.shortestPath.yens.write.estimate
gds.beta.allShortestPaths.dijkstra.mutate
gds.beta.allShortestPaths.dijkstra.mutate.estimate
gds.beta.allShortestPaths.dijkstra.stream
gds.beta.allShortestPaths.dijkstra.stream.estimate
gds.beta.allShortestPaths.dijkstra.write
gds.beta.allShortestPaths.dijkstra.write.estimate
- And removes alpha procedures
gds.alpha.shortestPath.stream
gds.alpha.shortestPath.write
gds.alpha.shortestPath.astar.stream
gds.alpha.kShortestPaths.stream
gds.alpha.kShortestPaths.write
gds.alpha.shortestPaths.stream
gds.alpha.shortestPaths.write
- GDS will now throw an error when a user tries to use a mutate procedure on graphs not stored in the graph catalog (anonymous graphs)
New Features
- Introduced machine learning based multi-class node classification procedures:
- Add gds.alpha.ml.nodeClassification.train to train a model to predict a node label
- Add gds.alpha.ml.nodeClassification.predict.mutate to make predictions using a trained model
- Introduced machine learning based link prediction procedures:
- Add gds.alpha.linkPrediction.train procedure for training Link Prediction models.
- Added gds.alpha.linkPrediction.predict.mutate procedure for predicting relationships based on a trained Link Prediction model.
- Added support for list properties as features for
gds.alpha.nodeClassification
gds.beta.fastRPExtended
gds.beta.graphSage
- Added support for storing trained models on disk (Enterprise only)
gds.alpha.model.store
gds.alpha.model.load
gds.alpha.model.delete
- Added procedure for publishing trained models (Enterprise only)
gds.alpha.model.publish
- Added HITS algorithm to the alpha tier
gds.alpha.hits.mutate and gds.alpha.hits.mutate.estimate
gds.alpha.hits.stats and gds.alpha.hits.stats.estimate
gds.alpha.hits.stream and gds.alpha.hits.stream.estimate
gds.alpha.hits.write and gds.alpha.hits.write.estimate
- Added Speaker-Listener Label Propagation Algorithm (SLLPA) to the alpha tier
gds.alpha.sllpa.mutate and gds.alpha.sllpa.mutate.estimate
gds.alpha.sllpa.stats and gds.alpha.sllpa.stats.estimate
gds.alpha.sllpa.stream and gds.alpha.sllpa.stream.estimate
gds.alpha.sllpa.write and gds.alpha.sllpa.write.estimate
- Added CSV export capabilities with the gds.beta.graph.export.csv procedure to allow users to export their in-memory graph to CSV
- Added a progress logging procedure with gds.beta.listProgress, to return status of running algorithms. This is turned off by default, but can be enabled with gds.progress_tracking_enabled in the config.
- Add message reducer capability to Pregel framework to improve memory consumption and computation runtime.
- Add a new BitIdMap data structure to represent node id mappings (Enterprise only)
- The data structure can lead to a significant reduction in required heap space for an in-memory graph.
- The data structure is used for native graph projections and in some algorithms, e.g., Louvain.
- The data structure is not used in Cypher projections.
- The feature is enabled by default on GDS Enterprise Edition and can be disabled using the USE_BIT_ID_MAP feature toggle.
Bug fixes
- Adding projection parameters as additional configuration in gds.graph.create and gds.graph.create.cypher will throw an exception if improperly configured, instead of being silently ignored.
- Fixed a bug in gds.alpha.articleRank where centrality scores were not normalized correctly
Improvements
- Improved progress logging to write progress per individual node label in gds.graph.writeNodeProperties.
- When a named graph does not exist, the graph catalog will display similarly named stored graphs.
- When a saved model does not exist, the model catalog will display similarly named stored models.
- Add centralityDistribution to the return fields for the write mode of the alpha centrality algorithms.
- gds.beta.graph.generate using relationshipDistribution: 'POWER_LAW' applies the distribution to the native orientation.
- Add getNeighbours and isMultiGraph to the Pregel-API.
- Add centralityDistribution as a return field in gds.betweenness.[write/mutate/stats]
Other Changes
- The PageRank configuration parameter cacheWeights has been deprecated. The parameter had no effect.
- Deprecate minimumScore, maximumScore, scoreSum return fields in gds.betweenness.[write/mutate/stats]
GDS 1.4.1
Release date: 7 December, 2020
GDS 1.4.1 is compatible with Neo4j 4.0 and 4.1, but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.
Bug fixes
- Fixed a bug in progress logging for gds.graph.writeNodeProperties() and gds.graph.writeRelationships() where some percentages were missed, or others reported multiple times.
- Fixed a bug where gds.graph.writeNodeProperties() and gds.alpha.shortestPathDeltaStepping.write() were single threaded by default
- Fixed a bug where gds.alpha.node2vec ignored relationships for graphs with multiple projected relationship types.
- Fixed a bug where gds.pageRank.*.estimate would fail for very large node counts.
- Fixed a bug where using float array node properties (e.g. after running gds.fastRP.mutate) would fail in some situations.
- Fixed a bug where a graph with multiple labels and all nodes sharing at least one label could lead to either an exception or a wrongly mapped Neo4j id.
Improvements
- gds.pageRank will now select batches more dynamically to properly respect the requested concurrency.
1.3.5
Release date: 23 November, 2020
GDS 1.3.5 is compatible with Neo4j 4.0 and 4.1, but not Neo4j 3.5.x or 4.2. For a 3.5 compatible release, please see GDS 1.1.6. For a 4.2 compatible release, please see GDS 1.4.0.
See also 1.3.0 release notes, 1.3.1 release notes, 1.3.2 release notes, 1.3.3 release notes, and 1.3.4 release notes.
Bug fixes
- Fixed a bug in gds.graph.export where at most one relationship property per relationship type would be exported.
- Fixed a bug in Louvain where changes to maxIterations were ignored.
- Fixed a bug where gds.alpha.node2vec would ignore relationships for graphs with multiple projected relationship types.
GDS 1.4.0
Release date: 5 November, 2020
GDS 1.4.0 is compatible with Neo4j 4.0 and 4.1, but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.
Breaking changes
- License key configuration was renamed from licenseFile to license_file for consistency with Bloom
- Removed sparsity parameter from gds.alpha.randomProjection.*
- Renamed gds.alpha.randomProjection to gds.fastRP due to productization.
- Renamed embeddingSize parameter to embeddingDimension for fastRP, GraphSAGE and Node2Vec.
- Renamed projectedFeatureSize to projectedFeatureDimension for GraphSAGE
- Renamed nodePropertyNames to featureProperties in gds.beta.fastRPExtended and gds.beta.graphSage.train
- Default parameters for gds.fastRP have changed on the following configuration parameters:
- iterationWeights now has default [0.0, 1.0, 1.0]
- normalizeL2 has been removed and its effect is always applied
- Removed alpha procedures for GraphSage (replaced with beta tier, see New Features section)
gds.alpha.graphSage.stream
gds.alpha.graphSage.write
- GraphSage no longer directly calculates embeddings. Instead, it has been split into train (to generate a named model) and write, mutate, and stream to apply the model predictions to your data.
- Due to the creation of a train mode for GraphSage, the following configuration parameters were moved:
- embeddingSize - moved as configuration parameter of gds.beta.graphSage.train
- aggregator - moved as configuration parameter of gds.beta.graphSage.train
- activationFunction - moved as configuration parameter of gds.beta.graphSage.train
- sampleSizes - moved as configuration parameter of gds.beta.graphSage.train
- nodePropertyNames - moved as configuration parameter of gds.beta.graphSage.train
- tolerance - moved as configuration parameter of gds.beta.graphSage.train
- learningRate - moved as configuration parameter of gds.beta.graphSage.train
- epochs - moved as configuration parameter of gds.beta.graphSage.train
- maxIterations - moved as configuration parameter of gds.beta.graphSage.train
- searchDepth - moved as configuration parameter of gds.beta.graphSage.train
- negativeSampleWeight - moved as configuration parameter of gds.beta.graphSage.train
- degreeAsProperty - moved as configuration parameter of gds.beta.graphSage.train
- gds.beta.graphSage.stream procedure now requires modelName configuration parameter.
- gds.beta.graphSage.write procedure requires modelName configuration parameter.
- Removed startLoss and epochLosses from the result columns of gds.beta.graphSage.write.
- Added the graph create config as a return field to the train procedure, affecting gds.beta.graphSage.train
- Fixed result column name embeddings to embedding in GraphSAGE, to align with the other embeddings.
- Removed configuration parameter maxCost from gds.alpha.bfs/dfs.
- Unlocking the Enterprise Edition of the Graph Data Science library requires a license key. The previous config setting has been removed.
- Removed degreeDistribution from gds.graph.drop return columns.
- gds.pageRank now respects the concurrency setting. It will not run if there is insufficient memory for the given concurrency setting.
- Alpha similarity algorithms no longer accept graph name as a parameter. The algorithm never used the named graph, and now the possibility to specify one is removed.
New features
- Promote GraphSage to beta tier and added support for inductive models with the train mode
- This adds procedures
gds.beta.graphSage.mutate
gds.beta.graphSage.mutate.estimate
gds.beta.graphSage.stream
gds.beta.graphSage.stream.estimate
gds.beta.graphSage.train
gds.beta.graphSage.train.estimate
gds.beta.graphSage.write
gds.beta.graphSage.write.estimate
- And removes alpha procedures
gds.alpha.graphSage.stream
gds.alpha.graphSage.write
- GraphSage supports relationship weights, driven by relationshipWeightProperty
- GraphSage supports node labels via projectedFeatureDimension
- Introduced the model catalog to manage trained models, including:
- gds.beta.model.exists - a procedure to check if a model exists in the catalog
- gds.beta.model.list - list all available models
- gds.beta.model.drop - removes a model from the catalog
- The Random Projection algorithm has been promoted to the product tier and we have added:
gds.fastRP.stats
gds.fastRP.mutate
gds.fastRP.estimate
- Added procedures for stats and mutate mode, as well as estimates for all modes.
- FastRP has been extended to support relationship weights and directions
- FastRP supports integer configuration for iteration weights.
- We’ve added support for node property features for FastRP in the beta namespace with FastRPExtended:
gds.beta.fastRPExtended.mutate
gds.beta.fastRPExtended.stream
gds.beta.fastRPExtended.stats
gds.beta.fastRPExtended.write
gds.beta.fastRPExtended.mutate.estimate
gds.beta.fastRPExtended.stream.estimate
gds.beta.fastRPExtended.stats.estimate
gds.beta.fastRPExtended.write.estimate
- We’ve added the K-Nearest Neighbors (KNN) algorithm to the beta tier
gds.beta.knn.mutate and gds.beta.knn.mutate.estimate
gds.beta.knn.stats and gds.beta.knn.stats.estimate
gds.beta.knn.stream and gds.beta.knn.stream.estimate
gds.beta.knn.write and gds.beta.knn.write.estimate
- The in-memory graph now supports list properties, so embedding results can be stored in memory and embeddings can be loaded from nodes for KNN or similarity calculations.
- Pregel framework
- Added Pregel annotation processor to generate GDS procedures for custom Pregel algorithms.
- Pregel now supports long and double array node values.
- Added support for composite node state to allow complex data types on nodes.
- Reduced memory consumption.
- Improved memory estimation.
- Simplified message iteration in compute methods.
- Split context into Init- and ComputeContext and simplified API.
- Removed K1ColoringExample standalone project.
- Added pregel-bootstrap standalone project.
- Added pregel-examples module.
- Licensing: GDS Enterprise edition now requires license keys issued by Neo4j to unlock enterprise features
- Added density property to the graph output of graph.list.
- Added a failIfMissing flag to gds.graph.drop
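The density value mentioned above can be pictured with a small calculation. The sketch below assumes the common definition for a directed graph without self-loops, relationship count divided by nodeCount * (nodeCount - 1). These notes do not state which formula GDS uses, so both the formula and the function name graph_density are assumptions made for illustration.

```python
def graph_density(node_count: int, relationship_count: int) -> float:
    """Density of a directed graph: existing relationships / possible relationships.

    Assumption (not taken from the release notes): the number of possible
    relationships is n * (n - 1), i.e. no self-loops and at most one
    relationship per ordered pair of nodes.
    """
    if node_count < 2:
        # With fewer than two nodes there are no possible relationships.
        return 0.0
    return relationship_count / (node_count * (node_count - 1))


if __name__ == "__main__":
    # 4 nodes allow 4 * 3 = 12 directed relationships; 6 of them exist.
    print(graph_density(4, 6))
```

Under this definition a graph with parallel relationships can report a density above 1.0, which is one reason to check the documented formula before relying on the value.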
Bug fixes
- Pregel:
- Fixed a bug in Pregel that could lead to incorrect results when running in parallel.
- Fix cast exception when returning array node properties in generated Pregel procedures.
- Fixed a bug in a multi-source BFS traversal strategy that could affect the following procedures:
gds.alpha.closeness
gds.alpha.closeness.harmonic
gds.alpha.allShortestPaths
- Fixed a bug in gds.alpha.shortestPath.deltaStepping where large relationship weights led to incorrect results
- Weakly connected components:
- Fixed a bug in WCC where componentCount would be negative when the graph is empty.
- Fixed a regression where WCC could run more slowly with increased concurrency.
- Fixed bugs in Louvain:
- communityCount is no longer negative when the graph is empty.
- Changes to maxIterations are no longer ignored.
- Fixed a bug in LabelPropagation where communityCount would be negative when the graph is empty.
- Fixed a bug in KNN where it failed when run on graphs with filtere...
GDS 1.4 Preview
Breaking changes
- Removed sparsity parameter from gds.alpha.randomProjection.*
- Renamed gds.alpha.randomProjection to gds.fastRP due to productization.
- Renamed embeddingSize parameter to embeddingDimension for fastRP, GraphSAGE and Node2Vec.
- Default parameters for gds.fastRP have changed on the following configuration parameters:
- iterationWeights now has default [0.0, 1.0, 1.0]
- normalizeL2 has been removed and its effect is always applied
- Removed alpha procedures for GraphSage (replaced with beta tier, see New Features section):
gds.alpha.graphSage.stream
gds.alpha.graphSage.write
- GraphSage no longer directly calculates embeddings. Instead, it has been split into train (to generate a named model) and write, mutate, and stream (to apply the model predictions to your data).
- Due to the creation of a train mode for GraphSage, the following configuration parameters were moved and are now configuration parameters of gds.beta.graphSage.train:
embeddingSize
aggregator
activationFunction
sampleSizes
nodePropertyNames
tolerance
learningRate
epochs
maxIterations
searchDepth
negativeSampleWeight
degreeAsProperty
- gds.beta.graphSage.stream procedure now requires modelName configuration parameter.
- gds.beta.graphSage.write procedure requires modelName configuration parameter.
- Removed startLoss and epochLosses from the result columns of gds.beta.graphSage.write.
- Added the graph create config as a return field to the train procedure, affecting gds.beta.graphSage.train
- Fixed result column name embeddings to embedding in GraphSAGE, to align with the other embeddings.
- Removed configuration parameter maxCost from gds.alpha.bfs/dfs.
- Unlocking the Enterprise Edition of the Graph Data Science library requires a license key. The previous config setting has been removed.
- Removed degreeDistribution from gds.graph.drop return columns.
- gds.pageRank now respects the concurrency setting. It will not run if there is insufficient memory for the given concurrency setting.
- Alpha similarity algorithms no longer accept graph name as a parameter. The algorithm never used the named graph, and now the possibility to specify one is removed.
New features
- Promoted GraphSage to the beta tier and added support for inductive models with the train mode
- This adds procedures:
gds.beta.graphSage.mutate
gds.beta.graphSage.mutate.estimate
gds.beta.graphSage.stream
gds.beta.graphSage.stream.estimate
gds.beta.graphSage.train
gds.beta.graphSage.train.estimate
gds.beta.graphSage.write
gds.beta.graphSage.write.estimate
- And removes alpha procedures
gds.alpha.graphSage.stream
gds.alpha.graphSage.write
- GraphSage supports relationship weights, driven by relationshipWeightProperty
- GraphSage supports node labels via projectedFeatureSize
- Introduced the model catalog to manage trained models, including:
- gds.beta.model.exists - a procedure to check if a model exists in the catalog
- gds.beta.model.list - list all available models
- gds.beta.model.drop - removes a model from the catalog
- The Random Projection algorithm has been promoted to the product tier and we have added:
gds.fastRP.stats
gds.fastRP.mutate
gds.fastRP.estimate
- Added procedures for stats and mutate mode, as well as estimates for all modes.
- FastRP has been extended to support relationship weights and directions
- FastRP supports integer configuration for iteration weights.
- We’ve added support for node property features for FastRP in the beta namespace with FastRPExtended:
gds.beta.fastRPExtended.mutate
gds.beta.fastRPExtended.stream
gds.beta.fastRPExtended.stats
gds.beta.fastRPExtended.write
gds.beta.fastRPExtended.mutate.estimate
gds.beta.fastRPExtended.stream.estimate
gds.beta.fastRPExtended.stats.estimate
gds.beta.fastRPExtended.write.estimate
- We’ve added the K-Nearest Neighbors (KNN) algorithm to the beta tier
gds.beta.knn.mutate and gds.beta.knn.mutate.estimate
gds.beta.knn.stats and gds.beta.knn.stats.estimate
gds.beta.knn.stream and gds.beta.knn.stream.estimate
gds.beta.knn.write and gds.beta.knn.write.estimate
- The in-memory graph now supports list properties, so embedding results can be stored in memory and embeddings can be loaded from nodes for KNN or similarity calculations.
- Pregel framework
- Added Pregel annotation processor to generate GDS procedures for custom Pregel algorithms.
- Pregel now supports long and double array node values.
- Added support for composite node state to allow complex data types on nodes.
- Reduced memory consumption.
- Improved memory estimation.
- Simplified message iteration in compute methods.
- Split context into Init- and ComputeContext and simplified API.
- Removed K1ColoringExample standalone project.
- Added pregel-bootstrap standalone project.
- Added pregel-examples module.
- Licensing: GDS Enterprise edition now requires license keys issued by Neo4j to unlock enterprise features
- Added density property to the graph output of graph.list.
- Added a failIfMissing flag to gds.graph.drop
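To make the combination of list properties and KNN more concrete, here is a minimal sketch of what a k-nearest-neighbors computation over per-node embedding lists looks like. It is a plain brute-force comparison using cosine similarity, written only for illustration; it is not the GDS implementation, and the function names and the choice of similarity metric are assumptions.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length numeric lists (0.0 for zero vectors)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def k_nearest_neighbors(embeddings, k):
    """Return {node: [(neighbor, similarity), ...]} holding the k most similar nodes.

    `embeddings` maps a node id to its list property (the embedding).
    Brute force: every node is compared against every other node.
    """
    result = {}
    for node, vector in embeddings.items():
        scored = [
            (other, cosine_similarity(vector, other_vector))
            for other, other_vector in embeddings.items()
            if other != node
        ]
        # Highest similarity first; ties are broken by node id for determinism.
        scored.sort(key=lambda pair: (-pair[1], pair[0]))
        result[node] = scored[:k]
    return result


if __name__ == "__main__":
    # Hypothetical embeddings stored as list properties on three nodes.
    embeddings = {0: [1.0, 0.0], 1: [0.9, 0.1], 2: [0.0, 1.0]}
    for node, neighbors in k_nearest_neighbors(embeddings, k=1).items():
        print(node, neighbors)
```

The brute-force loop is quadratic in the number of nodes, so it only serves to show the shape of the input (one list per node) and of the output (a ranked neighbor list per node).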
Bug fixes
- Pregel:
- Fixed a bug in Pregel that could lead to incorrect results when running in parallel.
- Fix cast exception when returning array node properties in generated Pregel procedures.
- Fixed a bug in a multi-source BFS traversal strategy that could affect the following procedures:
gds.alpha.closeness
gds.alpha.closeness.harmonic
gds.alpha.allShortestPaths
- Weakly connected components:
- Fixed a bug in WCC where componentCount would be negative when the graph is empty.
- Fixed a regression where WCC could run more slowly with increased concurrency.
- Fixed bugs in Louvain:
- communityCount is no longer negative when the graph is empty.
- Changes to maxIterations are no longer ignored.
- Fixed a bug in LabelPropagation where communityCount would be negative when the graph is empty.
- Fixed a bug in gds.graph.export where at most one relationship property per relationship type would be exported.
- Graph loading:
- Fixed a bug where using node label projections including properties on large graphs and high concurrency could lead to loss of some properties.
- Fixed bug in graph creation which could cause an AIOOB exception during node loading.
- The readConcurrency config parameter can no longer be overwritten by the concurrency param when it is explicitly set in an implicit graph creation config
- Fixed a bug in memory estimation of large anonymous fictitious graphs.
- Fixed bug in gds.alpha.dfs/bfs, where the algorithm did not terminate for graphs containing loops.
- Fixed result column name embeddings to embedding in GraphSAGE, to align with the other embeddings.
- Fixed a bug in Node2Vec where many disconnected nodes would cause a StackOverflowError
- Fixed a bug in RandomProjection where each iteration weight was multiplied by all previous iteration weights.
- Similarity algorithms:
- Fixed a bug where Alpha Similarity algorithms would load a graph even though it was not needed
- Fixed a bug where similarity algorithms would not remove the placeholder graph if config validation fails on invalid user input.
- Fixed a bug where community statistic computation could overflow for large community ids.
- Fixed a bug where DegreeCentrality returned incorrect values when concurrency > 1.
- Fixed a bug where ClosenessCentrality was using a slightly incorrect formula for the Wasserman-Faust algorithm.
- Fixed a bug that affected gds.triangleCount() and gds.alpha.triangles() where not all triangles would be counted under certain conditions.
- Parallel edges in a graph no longer lead to incorrect Local Clustering Coefficient and Triangle Count results.
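One of the fixes above concerns the Wasserman-Faust variant of closeness centrality. As a reference point, the sketch below computes the formula as it is commonly stated, (n - 1) / (N - 1) * (n - 1) / sum of distances, where N is the number of nodes in the graph and n is the number of nodes in the same component as the node being scored (the node itself included). The notes do not say what the incorrect formula was, so this is not a reconstruction of the bug or of the GDS code; the function names are hypothetical.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Hop distances from `source` to every reachable node (unweighted BFS)."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        current = queue.popleft()
        for neighbor in adjacency.get(current, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[current] + 1
                queue.append(neighbor)
    return distances


def wasserman_faust_closeness(adjacency, node):
    """Wasserman-Faust closeness of `node`.

    `adjacency` must contain every node of the graph as a key, because
    its length is used as the total node count N.
    """
    total_nodes = len(adjacency)
    distances = bfs_distances(adjacency, node)
    reachable = len(distances) - 1  # n - 1: reachable nodes, excluding `node`
    if reachable == 0 or total_nodes < 2:
        # Isolated node or single-node graph: closeness is defined as 0 here.
        return 0.0
    distance_sum = sum(distances.values())
    return (reachable / (total_nodes - 1)) * (reachable / distance_sum)


if __name__ == "__main__":
    # Path a - b - c plus an isolated node d (undirected, stored both ways).
    graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "d": []}
    for name in graph:
        print(name, wasserman_faust_closeness(graph, name))
```

The first factor scales the score by the fraction of the graph a node can reach, which keeps nodes in small disconnected components from scoring as highly as nodes in large ones.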
Improvements
- gds.fastRP now accepts integer iterationWeights
- If graphSage.train is run on a graph without relationships, GDS now fails gracefully with an appropriate error message
- Added validation that properties used by GraphSage exist on graph
- Added validation for embeddingSize >= 1
- Added a failIfExists flag to graph creation to enable a user to specify that if a graph already exists, it should be overwritten without failing.
- Progress logging:
- We now log progress in equally spaced percentages. This is 0-100% either in steps of 1, or in ...