Releases: neo4j/graph-data-science

GDS 1.6.0

27 May 21:19

Release Date: 27 May 2021

GDS 1.6 is compatible with Neo4j 4.0, 4.1, and 4.2 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.

Breaking changes

  • Degree centrality has been promoted to the product tier
    • Added procedures:
      • gds.degree.stream.estimate
      • gds.degree.write.estimate
      • gds.degree.mutate
      • gds.degree.mutate.estimate
      • gds.degree.stats
      • gds.degree.stats.estimate
    • Removed alpha procedures:
      • gds.alpha.degree.stream
      • gds.alpha.degree.write
  • Article Rank has been promoted to the product tier
    • Added procedures:
      • gds.articleRank.stream
      • gds.articleRank.stream.estimate
      • gds.articleRank.write
      • gds.articleRank.write.estimate
      • gds.articleRank.mutate
      • gds.articleRank.mutate.estimate
      • gds.articleRank.stats
      • gds.articleRank.stats.estimate
    • Removed alpha procedures:
      • gds.alpha.articleRank.stream
      • gds.alpha.articleRank.write
  • Eigenvector Centrality has been promoted to the product tier
    • Added procedures:
      • gds.eigenvector.stream
      • gds.eigenvector.stream.estimate
      • gds.eigenvector.write
      • gds.eigenvector.write.estimate
      • gds.eigenvector.mutate
      • gds.eigenvector.mutate.estimate
      • gds.eigenvector.stats
      • gds.eigenvector.stats.estimate
    • Removed alpha procedures:
      • gds.alpha.eigenvector.stream
      • gds.alpha.eigenvector.write
  • AStar has been promoted to the product tier
    • Added procedures:
      • gds.shortestPath.astar.stream
      • gds.shortestPath.astar.stream.estimate
      • gds.shortestPath.astar.write
      • gds.shortestPath.astar.write.estimate
      • gds.shortestPath.astar.mutate
      • gds.shortestPath.astar.mutate.estimate
    • Removed beta procedures:
      • gds.beta.shortestPath.astar.stream
      • gds.beta.shortestPath.astar.stream.estimate
      • gds.beta.shortestPath.astar.write
      • gds.beta.shortestPath.astar.write.estimate
      • gds.beta.shortestPath.astar.mutate
      • gds.beta.shortestPath.astar.mutate.estimate
    • The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
  • Yens K Shortest Paths has been promoted to the product tier:
    • Added procedures:
      • gds.shortestPath.yens.stream
      • gds.shortestPath.yens.stream.estimate
      • gds.shortestPath.yens.write
      • gds.shortestPath.yens.write.estimate
      • gds.shortestPath.yens.mutate
      • gds.shortestPath.yens.mutate.estimate
    • Removed beta procedures:
      • gds.beta.shortestPath.yens.stream
      • gds.beta.shortestPath.yens.stream.estimate
      • gds.beta.shortestPath.yens.write
      • gds.beta.shortestPath.yens.write.estimate
      • gds.beta.shortestPath.yens.mutate
      • gds.beta.shortestPath.yens.mutate.estimate
    • The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
  • Dijkstra Source-Target has been promoted to the product tier:
    • Added procedures:
      • gds.shortestPath.dijkstra.stream
      • gds.shortestPath.dijkstra.stream.estimate
      • gds.shortestPath.dijkstra.write
      • gds.shortestPath.dijkstra.write.estimate
      • gds.shortestPath.dijkstra.mutate
      • gds.shortestPath.dijkstra.mutate.estimate
    • Removed beta procedures:
      • gds.beta.shortestPath.dijkstra.stream
      • gds.beta.shortestPath.dijkstra.stream.estimate
      • gds.beta.shortestPath.dijkstra.write
      • gds.beta.shortestPath.dijkstra.write.estimate
      • gds.beta.shortestPath.dijkstra.mutate
      • gds.beta.shortestPath.dijkstra.mutate.estimate
    • The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
  • Dijkstra Single-Source has been promoted to the product tier:
    • Added procedures:
      • gds.allShortestPaths.dijkstra.stream
      • gds.allShortestPaths.dijkstra.stream.estimate
      • gds.allShortestPaths.dijkstra.write
      • gds.allShortestPaths.dijkstra.write.estimate
      • gds.allShortestPaths.dijkstra.mutate
      • gds.allShortestPaths.dijkstra.mutate.estimate
    • Removed beta procedures:
      • gds.beta.allShortestPaths.dijkstra.stream
      • gds.beta.allShortestPaths.dijkstra.stream.estimate
      • gds.beta.allShortestPaths.dijkstra.write
      • gds.beta.allShortestPaths.dijkstra.write.estimate
      • gds.beta.allShortestPaths.dijkstra.mutate
      • gds.beta.allShortestPaths.dijkstra.mutate.estimate
    • The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
  • Node2Vec has been promoted to the beta tier
    • Added procedures:
      • gds.beta.node2vec.stream
      • gds.beta.node2vec.stream.estimate
      • gds.beta.node2vec.write
      • gds.beta.node2vec.write.estimate
      • gds.beta.node2vec.mutate
      • gds.beta.node2vec.mutate.estimate
    • Removed alpha procedures:
      • gds.alpha.node2vec.stream
      • gds.alpha.node2vec.write
    • The parameter centerSamplingFactor is renamed to positiveSamplingFactor
    • The parameter contextSamplingExponent is renamed to negativeSamplingExponent
  • The model catalog list feature no longer throws an error when a non-existent model name is given
  • Node Classification and Link Prediction:
    • The maxStreakCount configuration parameter is renamed to patience. It is used in the train modes of Node Classification and Link Prediction.
    • The maxIterations and minIterations configuration parameters are renamed to maxEpochs and minEpochs. They are used in the train modes of Node Classification and Link Prediction.
    • The windowSize configuration parameter is removed from the train modes of Node Classification and Link Prediction.
  • gds.alpha.ml.linkPrediction.train configuration parameter classRatio is renamed to negativeClassWeight. It is also mandatory now.
  • Removed degreeAsProperty configuration parameter from GraphSAGE
    • The same effect can be achieved by running gds.degree.mutate and using the mutated property as a feature for GraphSAGE training.
    • Important: GraphSAGE models persisted with earlier versions of GDS are not compatible with this version.
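
The parameter renames listed above for the Node Classification and Link Prediction train modes can be applied mechanically to an existing configuration map. The following is a minimal sketch, not part of GDS: the helper `migrate_train_config` and its `link_prediction` flag are hypothetical names introduced for illustration, and only the key names themselves come from the notes above.

```python
# Key renames taken from the breaking-changes list; the helper is hypothetical.
RENAMED_KEYS = {
    "maxStreakCount": "patience",
    "maxIterations": "maxEpochs",
    "minIterations": "minEpochs",
}
# windowSize was removed from the train modes.
REMOVED_KEYS = {"windowSize"}
# Applies only to gds.alpha.ml.linkPrediction.train.
LINK_PREDICTION_RENAMES = {"classRatio": "negativeClassWeight"}


def migrate_train_config(config, link_prediction=False):
    """Return a copy of a 1.5-style train config that uses the 1.6 key names."""
    renames = dict(RENAMED_KEYS)
    if link_prediction:
        renames.update(LINK_PREDICTION_RENAMES)

    migrated = {}
    for key, value in config.items():
        if key in REMOVED_KEYS:
            continue  # dropped in 1.6, so do not pass it on
        migrated[renames.get(key, key)] = value

    # The notes state that negativeClassWeight is mandatory for link prediction.
    if link_prediction and "negativeClassWeight" not in migrated:
        raise ValueError("negativeClassWeight is mandatory in GDS 1.6")
    return migrated
```

The function copies rather than mutates its input, so the old configuration stays available for comparison while queries are being updated.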

    New features

    • New ScaleProperties procedures to transform and scale node properties. Available scalers: Min-max, Max, Mean, Log, Standard Score, L1 Norm, L2 Norm
      • gds.alpha.scaleProperties.stream
      • gds.alpha.scaleProperties.mutate
    • Added ability to create new in-memory graphs by filtering existing named graphs based on node and relationship properties with new catalog procedure gds.beta.graph.create.subgraph
    • Two new centrality algorithms for influence maximization were contributed by community member @xkitsios
      • gds.alpha.influenceMaximization.celf.stream
      • gds.alpha.influenceMaximization.greedy.stream
    • Link Prediction:
      • Added support for storing, loading and publishing Link Prediction models.
      • Added progress logging for gds.alpha.ml.linkPrediction.train and gds.alpha.ml.linkPrediction.predict.
      • Added write and stream modes to gds.alpha.ml.linkPrediction.predict
        • gds.alpha.ml.linkPrediction.stream
        • gds.alpha.ml.linkPrediction.write
      • Added estimate mode for Link Prediction:
        • gds.alpha.ml.linkPrediction.train.estimate
        • gds.alpha.ml.lin...
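
The scalers named for the ScaleProperties procedures above correspond to standard feature-scaling formulas. The sketch below shows the commonly used definitions in plain Python; it is an illustration only, and the exact behaviour of the GDS implementation may differ. In particular, the use of the population standard deviation, the absolute-value maximum in `max_scale`, and returning 0.0 for constant inputs are assumptions made here.

```python
import math


def _safe_div(numerator, denominator):
    # Assumption: a zero denominator (constant property) yields 0.0.
    return numerator / denominator if denominator else 0.0


def min_max(xs):
    lo, hi = min(xs), max(xs)
    return [_safe_div(x - lo, hi - lo) for x in xs]


def max_scale(xs):
    # Divide by the largest absolute value so results lie in [-1, 1].
    biggest = max(abs(x) for x in xs)
    return [_safe_div(x, biggest) for x in xs]


def mean_scale(xs):
    mean = sum(xs) / len(xs)
    return [_safe_div(x - mean, max(xs) - min(xs)) for x in xs]


def log_scale(xs):
    # Natural logarithm; only defined for positive values.
    return [math.log(x) for x in xs]


def std_score(xs):
    mean = sum(xs) / len(xs)
    # Population standard deviation (an assumption of this sketch).
    std = math.sqrt(sum((x - mean) ** 2 for x in xs) / len(xs))
    return [_safe_div(x - mean, std) for x in xs]


def l1_norm(xs):
    total = sum(abs(x) for x in xs)
    return [_safe_div(x, total) for x in xs]


def l2_norm(xs):
    length = math.sqrt(sum(x * x for x in xs))
    return [_safe_div(x, length) for x in xs]
```

Each function takes the values of one node property across all nodes and returns the scaled values in the same order.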

    GDS 1.6 Preview

    20 May 22:23
    Pre-release

    Release Date: 20 May 2021

    GDS 1.6 is compatible with Neo4j 4.0, 4.1, and 4.2 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.

    Breaking changes

    • Degree centrality has been promoted to the product tier
      • Added procedures:
        • gds.degree.stream.estimate
        • gds.degree.write.estimate
        • gds.degree.mutate
        • gds.degree.mutate.estimate
        • gds.degree.stats
        • gds.degree.stats.estimate
      • Removed alpha procedures:
        • gds.alpha.degree.stream
        • gds.alpha.degree.write
    • Article Rank has been promoted to the product tier
      • Added procedures:
        • gds.articleRank.stream
        • gds.articleRank.stream.estimate
        • gds.articleRank.write
        • gds.articleRank.write.estimate
        • gds.articleRank.mutate
        • gds.articleRank.mutate.estimate
        • gds.articleRank.stats
        • gds.articleRank.stats.estimate
      • Removed alpha procedures:
        • gds.alpha.articleRank.stream
        • gds.alpha.articleRank.write
    • Eigenvector Centrality has been promoted to the product tier
      • Added procedures:
        • gds.eigenvector.stream
        • gds.eigenvector.stream.estimate
        • gds.eigenvector.write
        • gds.eigenvector.write.estimate
        • gds.eigenvector.mutate
        • gds.eigenvector.mutate.estimate
        • gds.eigenvector.stats
        • gds.eigenvector.stats.estimate
      • Removed alpha procedures:
        • gds.alpha.eigenvector.stream
        • gds.alpha.eigenvector.write
    • AStar has been promoted to the product tier
      • Added procedures:
        • gds.shortestPath.astar.stream
        • gds.shortestPath.astar.stream.estimate
        • gds.shortestPath.astar.write
        • gds.shortestPath.astar.write.estimate
        • gds.shortestPath.astar.mutate
        • gds.shortestPath.astar.mutate.estimate
      • Removed beta procedures:
        • gds.beta.shortestPath.astar.stream
        • gds.beta.shortestPath.astar.stream.estimate
        • gds.beta.shortestPath.astar.write
        • gds.beta.shortestPath.astar.write.estimate
        • gds.beta.shortestPath.astar.mutate
        • gds.beta.shortestPath.astar.mutate.estimate
      • The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
    • Yens K Shortest Paths has been promoted to the product tier:
      • Added procedures:
        • gds.shortestPath.yens.stream
        • gds.shortestPath.yens.stream.estimate
        • gds.shortestPath.yens.write
        • gds.shortestPath.yens.write.estimate
        • gds.shortestPath.yens.mutate
        • gds.shortestPath.yens.mutate.estimate
      • Removed beta procedures:
        • gds.beta.shortestPath.yens.stream
        • gds.beta.shortestPath.yens.stream.estimate
        • gds.beta.shortestPath.yens.write
        • gds.beta.shortestPath.yens.write.estimate
        • gds.beta.shortestPath.yens.mutate
        • gds.beta.shortestPath.yens.mutate.estimate
      • The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
    • Dijkstra Source-Target has been promoted to the product tier:
      • Added procedures:
        • gds.shortestPath.dijkstra.stream
        • gds.shortestPath.dijkstra.stream.estimate
        • gds.shortestPath.dijkstra.write
        • gds.shortestPath.dijkstra.write.estimate
        • gds.shortestPath.dijkstra.mutate
        • gds.shortestPath.dijkstra.mutate.estimate
      • Removed beta procedures:
        • gds.beta.shortestPath.dijkstra.stream
        • gds.beta.shortestPath.dijkstra.stream.estimate
        • gds.beta.shortestPath.dijkstra.write
        • gds.beta.shortestPath.dijkstra.write.estimate
        • gds.beta.shortestPath.dijkstra.mutate
        • gds.beta.shortestPath.dijkstra.mutate.estimate
      • The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
    • Dijkstra Single-Source has been promoted to the product tier:
      • Added procedures:
        • gds.allShortestPaths.dijkstra.stream
        • gds.allShortestPaths.dijkstra.stream.estimate
        • gds.allShortestPaths.dijkstra.write
        • gds.allShortestPaths.dijkstra.write.estimate
        • gds.allShortestPaths.dijkstra.mutate
        • gds.allShortestPaths.dijkstra.mutate.estimate
      • Removed beta procedures:
        • gds.beta.allShortestPaths.dijkstra.stream
        • gds.beta.allShortestPaths.dijkstra.stream.estimate
        • gds.beta.allShortestPaths.dijkstra.write
        • gds.beta.allShortestPaths.dijkstra.write.estimate
        • gds.beta.allShortestPaths.dijkstra.mutate
        • gds.beta.allShortestPaths.dijkstra.mutate.estimate
      • The parameter path was removed. The path computation is controlled by the Cypher YIELD sub-clause.
    • Node2Vec has been promoted to the beta tier
      • Added procedures:
        • gds.beta.node2vec.stream
        • gds.beta.node2vec.stream.estimate
        • gds.beta.node2vec.write
        • gds.beta.node2vec.write.estimate
        • gds.beta.node2vec.mutate
        • gds.beta.node2vec.mutate.estimate
      • Removed alpha procedures:
        • gds.alpha.node2vec.stream
        • gds.alpha.node2vec.write
      • The parameter centerSamplingFactor is renamed to positiveSamplingFactor
      • The parameter contextSamplingExponent is renamed to negativeSamplingExponent
    • The model catalog list feature no longer throws an error when a non-existent model name is given
    • Node Classification and Link Prediction:
      • The maxStreakCount configuration parameter is renamed to patience. It is used in the train modes of Node Classification and Link Prediction.
      • The maxIterations and minIterations configuration parameters are renamed to maxEpochs and minEpochs. They are used in the train modes of Node Classification and Link Prediction.
      • The windowSize configuration parameter is removed from the train modes of Node Classification and Link Prediction.
    • gds.alpha.ml.linkPrediction.train configuration parameter classRatio is renamed to negativeClassWeight. It is also mandatory now.
    • Removed degreeAsProperty configuration parameter from GraphSAGE
      • The same effect can be achieved by running gds.degree.mutate and using the mutated property as a feature for GraphSAGE training.
      • Important: GraphSAGE models persisted with earlier versions of GDS are not compatible with this version.
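
Because the promotions above change only the tier prefix of each procedure, stored Cypher queries can be updated with a simple textual rewrite. The sketch below covers a subset of the renames listed above; the `PROMOTED` table and the `rewrite_procedures` helper are hypothetical names for illustration and are not part of GDS. It changes procedure names only, so queries that used the removed path parameter still need manual attention.

```python
import re

# Old prefix -> new prefix, copied from the promotion lists above (subset).
PROMOTED = {
    "gds.alpha.articleRank": "gds.articleRank",
    "gds.alpha.eigenvector": "gds.eigenvector",
    "gds.beta.shortestPath.dijkstra": "gds.shortestPath.dijkstra",
    "gds.alpha.node2vec": "gds.beta.node2vec",
}

# Match an old prefix only when a mode such as ".stream" follows it.
_PATTERN = re.compile(
    "|".join(
        re.escape(old) + r"(?=\.)"
        for old in sorted(PROMOTED, key=len, reverse=True)
    )
)


def rewrite_procedures(query):
    """Replace removed procedure prefixes in a Cypher query string."""
    return _PATTERN.sub(lambda match: PROMOTED[match.group(0)], query)


if __name__ == "__main__":
    print(rewrite_procedures("CALL gds.alpha.articleRank.stream('g') YIELD nodeId, score"))
```

Queries that already use the new names pass through unchanged, so the rewrite is safe to run repeatedly.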

    New features

    • New ScaleProperties procedures to transform and scale node properties. Available scalers: Min-max, Max, Mean, Log, Standard Score, L1 Norm, L2 Norm
      • gds.alpha.scaleProperties.stream
      • gds.alpha.scaleProperties.mutate
    • Added ability to create new in-memory graphs by filtering existing named graphs based on node and relationship properties with new catalog procedure gds.beta.graph.create.subgraph
    • Two new centrality algorithms for influence maximization were contributed by community member @xkitsios
      • gds.alpha.influenceMaximization.celf.stream
      • gds.alpha.influenceMaximization.greedy.stream
    • Link Prediction:
      • Added support for storing, loading and publishing Link Prediction models.
      • Added progress logging for gds.alpha.ml.linkPrediction.train and gds.alpha.ml.linkPrediction.predict.
      • Added write and stream modes to gds.alpha.ml.linkPrediction.predict
        • gds.alpha.ml.linkPrediction.stream
        • gds.alpha.ml.linkPrediction.write
      • Added estimate mode for Link Prediction:
        • gds.alpha.ml.linkPrediction.train.estimate
        • gds.alpha.ml.lin...

    1.5.2

    11 May 17:48

    Release Date: 11 May 2021

    GDS 1.5 is compatible with Neo4j 4.0, 4.1, and 4.2 but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.

    Bug fixes

    • Fixed a bug in the FastRPExtended internals that could cause the output to contain NaNs, especially when propertyDimension == embeddingDimension.
    • Fixed a bug where Alpha similarity algorithms in some cases could fail on division by 0 when writing results back.
    • Fixed an issue where gds.graph.drop could take a long time when the graph contained node embeddings.
    • Fixed a bug where gds.beta.graphSage.train was failing in the presence of array properties.

    1.5.1

    19 Mar 20:41

    Release Date: 3 March, 2021

    GDS 1.5 is compatible with Neo4j 4.0 and 4.1, but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.

    Bug fixes

    • Fixed a bug that caused gds.graph.list and gds.graph.drop to throw an error for a graph specified with duplicate property keys. Such a specification now fails early.
    • Fixed potential ArrayIndexOutOfBoundsException when running gds.triangleCount on a relationship-filtered graph.
    • Fixed a bug that can lead to inconsistencies when writing or mutating new relationships created from a label-filtered graph.

    Improvements

    • Progress logging: Removed a "disabled" log message from the database startup when GDS was running in its default configuration. It is replaced with a more elaborate "enabled" message when the progress tracking feature is enabled.
    • The error message now includes the name of the current database if a graph is not found.

    1.5.0

    09 Feb 22:35

    Release Date: 9 February, 2021

    GDS 1.5 is compatible with Neo4j 4.0 and 4.1, but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.

    Breaking changes

    • Promote several shortest path algorithms to beta tier: Dijkstra, A*, and Yens k-shortest paths. The APIs have been standardized, and all include the ability to return source/target nodes, nodes traversed, and paths.
      • This adds procedures
        • gds.beta.shortestPath.dijkstra.mutate
        • gds.beta.shortestPath.dijkstra.mutate.estimate
        • gds.beta.shortestPath.dijkstra.stream
        • gds.beta.shortestPath.dijkstra.stream.estimate
        • gds.beta.shortestPath.dijkstra.write
        • gds.beta.shortestPath.dijkstra.write.estimate
        • gds.beta.shortestPath.astar.mutate
        • gds.beta.shortestPath.astar.mutate.estimate
        • gds.beta.shortestPath.astar.stream
        • gds.beta.shortestPath.astar.stream.estimate
        • gds.beta.shortestPath.astar.write
        • gds.beta.shortestPath.astar.write.estimate
        • gds.beta.shortestPath.yens.mutate
        • gds.beta.shortestPath.yens.mutate.estimate
        • gds.beta.shortestPath.yens.stream
        • gds.beta.shortestPath.yens.stream.estimate
        • gds.beta.shortestPath.yens.write
        • gds.beta.shortestPath.yens.write.estimate
        • gds.beta.allShortestPaths.dijkstra.mutate
        • gds.beta.allShortestPaths.dijkstra.mutate.estimate
        • gds.beta.allShortestPaths.dijkstra.stream
        • gds.beta.allShortestPaths.dijkstra.stream.estimate
        • gds.beta.allShortestPaths.dijkstra.write
        • gds.beta.allShortestPaths.dijkstra.write.estimate
      • And removes alpha procedures
        • gds.alpha.shortestPath.stream
        • gds.alpha.shortestPath.write
        • gds.alpha.shortestPath.astar.stream
        • gds.alpha.kShortestPaths.stream
        • gds.alpha.kShortestPaths.write
        • gds.alpha.shortestPaths.stream
        • gds.alpha.shortestPaths.write
    • GDS will now throw an error when a user tries to use a mutate procedure on graphs not stored in the graph catalog (anonymous graphs)

    New Features

    • Introduced machine learning based multi-class node classification procedures:
      • Add gds.alpha.ml.nodeClassification.train to train a model to predict a node label
      • Add gds.alpha.ml.nodeClassification.predict.mutate to make predictions using a trained model
    • Introduced machine learning based link prediction procedures:
      • Add gds.alpha.ml.linkPrediction.train procedure for training Link Prediction models.
      • Add gds.alpha.ml.linkPrediction.predict.mutate procedure for predicting relationships based on a trained Link Prediction model.
    • Added support for list properties as features for
      • gds.alpha.ml.nodeClassification
      • gds.beta.fastRPExtended
      • gds.beta.graphSage
    • Added support for storing trained models on disk (Enterprise only)
      • gds.alpha.model.store
      • gds.alpha.model.load
      • gds.alpha.model.delete
    • Added procedure for publishing trained models (Enterprise only)
      • gds.alpha.model.publish
    • Added HITS algorithm to the alpha tier
      • gds.alpha.hits.mutate and gds.alpha.hits.mutate.estimate
      • gds.alpha.hits.stats and gds.alpha.hits.stats.estimate
      • gds.alpha.hits.stream and gds.alpha.hits.stream.estimate
      • gds.alpha.hits.write and gds.alpha.hits.write.estimate
    • Added Speaker-Listener Label Propagation Algorithm (SLLPA) to the alpha tier
      • gds.alpha.sllpa.mutate and gds.alpha.sllpa.mutate.estimate
      • gds.alpha.sllpa.stats and gds.alpha.sllpa.stats.estimate
      • gds.alpha.sllpa.stream and gds.alpha.sllpa.stream.estimate
      • gds.alpha.sllpa.write and gds.alpha.sllpa.write.estimate
    • Added CSV export capabilities with the gds.beta.graph.export.csv procedure to allow users to export their in-memory graph to CSV
    • Add message reducer capability to Pregel framework to improve memory consumption and computation runtime.
    • Added a progress logging procedure with gds.beta.listProgress, to return status of running algorithms. This is turned off by default, but can be enabled with gds.progress_tracking_enabled in the config.
    • Add a new BitIdMap data structure to represent node id mappings (Enterprise only)
      • The data structure can lead to a significant reduction in required heap space for an in-memory graph.
      • The data structure is used for native graph projections and in some algorithms, e.g., Louvain.
      • The data structure is not used in Cypher projections.
      • The feature is enabled by default on GDS Enterprise Edition and can be disabled using the USE_BIT_ID_MAP feature toggle.
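
The message reducer mentioned above for the Pregel framework can be understood by comparing two ways of holding incoming messages. The sketch below is a conceptual illustration in Python, not the GDS Pregel Java API; the class names `QueueInbox` and `ReducingInbox` are hypothetical. A queue keeps every message until the receiver reads it, whereas a reducer folds each message into one value per node as it arrives, which is where the memory saving comes from.

```python
class QueueInbox:
    """Stores every message; memory grows with the number of messages."""

    def __init__(self, node_count):
        self._messages = [[] for _ in range(node_count)]

    def send(self, target, value):
        self._messages[target].append(value)

    def receive(self, node):
        return self._messages[node]


class ReducingInbox:
    """Keeps one combined value per node; memory is fixed at node_count."""

    def __init__(self, node_count, reducer, identity):
        self._reducer = reducer
        # The identity is the value a node sees when it got no messages.
        self._values = [identity] * node_count

    def send(self, target, value):
        # Combine on arrival instead of buffering the message.
        self._values[target] = self._reducer(self._values[target], value)

    def receive(self, node):
        return self._values[node]


if __name__ == "__main__":
    # Shortest-path style computation: only the minimum distance matters.
    inbox = ReducingInbox(3, min, float("inf"))
    for distance in (4.0, 2.5, 7.0):
        inbox.send(1, distance)
    print(inbox.receive(1))
```

A reducer fits algorithms whose compute step only needs an aggregate of the messages, such as a minimum or a sum. Algorithms that must inspect each message individually still need a queue.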

    Bug fixes

    • Adding projection parameters as additional configuration in gds.graph.create and gds.graph.create.cypher will throw an exception if improperly configured, instead of being silently ignored.
    • Fixed a bug in gds.alpha.articleRank where centrality scores were not normalized correctly
    • Fixed a bug in path stream procedures where the path object (path: true) used incorrect node identifiers.
    • Fixed a bug in path write procedures where the relationship property nodeIds contained incorrect node identifiers.
    • Fixed a race condition that could cause exceptions thrown by scheduled tasks to be suppressed.

    Improvements

    • Improved progress logging to write progress per individual node label in gds.graph.writeNodeProperties.
    • When a named graph does not exist, the graph catalog will display similarly named stored graphs.
    • When a saved model does not exist, the model catalog will display similarly named stored models.
    • Added centralityDistribution to the return fields for the write mode of the alpha centrality algorithms.
    • gds.beta.graph.generate using relationshipDistribution: 'POWER_LAW' applies the distribution to the native orientation.
    • Added centralityDistribution as a return field in gds.betweenness.[write/mutate/stats]
    • Added getNeighbours and isMultiGraph to the Pregel-API.
    • Added new message queue implementations for the Pregel framework, which
      • replace the previously used JCTools queue and work with primitive double arrays instead of boxed values.
      • lead to 3x to 5x faster runtimes for Pregel based algorithms.
      • reduce GC pressure due to less object allocations which leads to more predictable runtimes.
      • support synchronous and asynchronous Pregel computations.

    Other Changes

    • The PageRank configuration parameter cacheWeights has been deprecated. The parameter had no effect.
    • Deprecated minimumScore, maximumScore, scoreSum return fields in gds.betweenness.[write/mutate/stats]

    GDS 1.5 Preview

    29 Jan 21:20
    Pre-release

    Release date: 29 January, 2021

    Warning: This is a preview release and not intended for production use. If you have any feedback, please let us know: https://github.com/neo4j/graph-data-science/issues

    GDS 1.5 is compatible with Neo4j 4.0 and 4.1, but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.

    Breaking changes

    • Promote several shortest path algorithms to beta tier: Dijkstra, A*, and Yens k-shortest paths. The APIs have been standardized, and all include the ability to return source/target nodes, nodes traversed, and paths.
      • This adds procedures
        • gds.beta.shortestPath.dijkstra.mutate
        • gds.beta.shortestPath.dijkstra.mutate.estimate
        • gds.beta.shortestPath.dijkstra.stream
        • gds.beta.shortestPath.dijkstra.stream.estimate
        • gds.beta.shortestPath.dijkstra.write
        • gds.beta.shortestPath.dijkstra.write.estimate
        • gds.beta.shortestPath.astar.mutate
        • gds.beta.shortestPath.astar.mutate.estimate
        • gds.beta.shortestPath.astar.stream
        • gds.beta.shortestPath.astar.stream.estimate
        • gds.beta.shortestPath.astar.write
        • gds.beta.shortestPath.astar.write.estimate
        • gds.beta.shortestPath.yens.mutate
        • gds.beta.shortestPath.yens.mutate.estimate
        • gds.beta.shortestPath.yens.stream
        • gds.beta.shortestPath.yens.stream.estimate
        • gds.beta.shortestPath.yens.write
        • gds.beta.shortestPath.yens.write.estimate
        • gds.beta.allShortestPaths.dijkstra.mutate
        • gds.beta.allShortestPaths.dijkstra.mutate.estimate
        • gds.beta.allShortestPaths.dijkstra.stream
        • gds.beta.allShortestPaths.dijkstra.stream.estimate
        • gds.beta.allShortestPaths.dijkstra.write
        • gds.beta.allShortestPaths.dijkstra.write.estimate
      • And removes alpha procedures
        • gds.alpha.shortestPath.stream
        • gds.alpha.shortestPath.write
        • gds.alpha.shortestPath.astar.stream
        • gds.alpha.kShortestPaths.stream
        • gds.alpha.kShortestPaths.write
        • gds.alpha.shortestPaths.stream
        • gds.alpha.shortestPaths.write
    • GDS will now throw an error when a user tries to use a mutate procedure on graphs not stored in the graph catalog (anonymous graphs)

    New Features

    • Introduced machine learning based multi-class node classification procedures:
      • Add gds.alpha.ml.nodeClassification.train to train a model to predict a node label
      • Add gds.alpha.ml.nodeClassification.predict.mutate to make predictions using a trained model
    • Introduced machine learning based link prediction procedures:
      • Add gds.alpha.ml.linkPrediction.train procedure for training Link Prediction models.
      • Add gds.alpha.ml.linkPrediction.predict.mutate procedure for predicting relationships based on a trained Link Prediction model.
    • Added support for list properties as features for
      • gds.alpha.ml.nodeClassification
      • gds.beta.fastRPExtended
      • gds.beta.graphSage
    • Added support for storing trained models on disk (Enterprise only)
      • gds.alpha.model.store
      • gds.alpha.model.load
      • gds.alpha.model.delete
    • Added procedure for publishing trained models (Enterprise only)
      • gds.alpha.model.publish
    • Added HITS algorithm to the alpha tier
      • gds.alpha.hits.mutate and gds.alpha.hits.mutate.estimate
      • gds.alpha.hits.stats and gds.alpha.hits.stats.estimate
      • gds.alpha.hits.stream and gds.alpha.hits.stream.estimate
      • gds.alpha.hits.write and gds.alpha.hits.write.estimate
    • Added Speaker-Listener Label Propagation Algorithm (SLLPA) to the alpha tier
      • gds.alpha.sllpa.mutate and gds.alpha.sllpa.mutate.estimate
      • gds.alpha.sllpa.stats and gds.alpha.sllpa.stats.estimate
      • gds.alpha.sllpa.stream and gds.alpha.sllpa.stream.estimate
      • gds.alpha.sllpa.write and gds.alpha.sllpa.write.estimate
    • Added CSV export capabilities with the gds.beta.graph.export.csv procedure to allow users to export their in-memory graph to CSV
    • Added a progress logging procedure with gds.beta.listProgress, to return status of running algorithms. This is turned off by default, but can be enabled with gds.progress_tracking_enabled in the config.
    • Add message reducer capability to Pregel framework to improve memory consumption and computation runtime.
    • Add a new BitIdMap data structure to represent node id mappings (Enterprise only)
      • The data structure can lead to a significant reduction in required heap space for an in-memory graph.
      • The data structure is used for native graph projections and in some algorithms, e.g., Louvain.
      • The data structure is not used in Cypher projections.
      • The feature is enabled by default on GDS Enterprise Edition and can be disabled using the USE_BIT_ID_MAP feature toggle.

    Bug fixes

    • Adding projection parameters as additional configuration in gds.graph.create and gds.graph.create.cypher will throw an exception if improperly configured, instead of being silently ignored.
    • Fixed a bug in gds.alpha.articleRank where centrality scores were not normalized correctly

    Improvements

    • Improved progress logging to write progress per individual node label in gds.graph.writeNodeProperties.
    • When a named graph does not exist, the graph catalog will display similarly named stored graphs.
    • When a saved model does not exist, the model catalog will display similarly named stored models.
    • Add centralityDistribution to the return fields for the write mode of the alpha centrality algorithms.
    • gds.beta.graph.generate using relationshipDistribution: 'POWER_LAW' applies the distribution to the native orientation.
    • Add getNeighbours and isMultiGraph to the Pregel-API.
    • Add centralityDistribution as a return field in gds.betweenness.[write/mutate/stats]

    Other Changes

    • The PageRank configuration parameter cacheWeights has been deprecated. The parameter had no effect.
    • Deprecate minimumScore, maximumScore, scoreSum return fields in gds.betweenness.[write/mutate/stats]

    GDS 1.4.1

    07 Dec 23:22

    Release date: 7 December, 2020

    GDS 1.4.1 is compatible with Neo4j 4.0 and 4.1, but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.

    Bug fixes

    • Fixed a bug in progress logging for gds.graph.writeNodeProperties() and gds.graph.writeRelationships() where some percentages were missed, or others reported multiple times.
    • Fixed a bug where gds.graph.writeNodeProperties() and gds.alpha.shortestPathDeltaStepping.write() were single threaded by default
    • Fixed a bug where gds.alpha.node2vec ignored relationships for graphs with multiple projected relationship types.
    • Fixed a bug where gds.pagerank.*.estimate would fail for very large node counts.
    • Fixed a bug where using float array node properties (e.g. after running gds.fastRP.mutate) would fail in some situations.
    • Fixed a bug where a graph with multiple labels and all nodes sharing at least one label could lead to either an exception or a wrongly mapped Neo4j id.

    Improvements

    • gds.pageRank will now select batches more dynamically to properly respect the requested concurrency.

    1.3.5

    02 Dec 17:00

    Release date: 23 November, 2020

    GDS 1.3.5 is compatible with Neo4j 4.0 and 4.1, but not Neo4j 3.5.x or 4.2. For a 3.5 compatible release, please see GDS 1.1.6. For a 4.2 compatible release, please see GDS 1.4.0.

    See also 1.3.0 release notes, 1.3.1 release notes, 1.3.2 release notes, 1.3.3 release notes, and 1.3.4 release notes.

    Bug fixes

    • Fixed a bug in gds.graph.export where at most one relationship property per relationship type would be exported.
    • Fixed a bug in Louvain where changes to maxIterations were ignored.
    • Fixed a bug where gds.alpha.node2vec would ignore relationships for graphs with multiple projected relationship types.

    GDS 1.4.0

    05 Nov 20:16

    Release date: 5 November, 2020

    GDS 1.4.0 is compatible with Neo4j 4.0 and 4.1, but not Neo4j 3.5.x. For a 3.5 compatible release, please see GDS 1.1.6.

    Breaking changes

    • License key configuration was renamed from licenseFile to license_file for consistency with Bloom
    • Removed sparsity parameter from gds.alpha.randomProjection.*
    • Renamed gds.alpha.randomProjection to gds.fastRP due to productization.
    • Renamed embeddingSize parameter to embeddingDimension for fastRP, GraphSAGE and Node2Vec.
    • Renamed projectedFeatureSize to projectedFeatureDimension for GraphSAGE
    • Renamed nodePropertyNames to featureProperties in gds.beta.fastRPExtended and gds.beta.graphSage.train
    • Default parameters for gds.fastRP have changed on the following configuration parameters:
      • iterationWeights now has default [0.0, 1.0, 1.0]
      • normalizeL2 has been removed and its effect is always applied
    • Removed alpha procedures for GraphSage (replaced with beta tier, see New Features section)
      • gds.alpha.graphSage.stream
      • gds.alpha.graphSage.write
    • GraphSage no longer calculates embeddings directly. Instead, it has been split into train (to generate a named model) and write, mutate, and stream (to apply the model's predictions to your data).
    • Due to the creation of a train mode for GraphSage, the following configuration parameters were moved to gds.beta.graphSage.train:
      • embeddingSize
      • aggregator
      • activationFunction
      • sampleSizes
      • nodePropertyNames
      • tolerance
      • learningRate
      • epochs
      • maxIterations
      • searchDepth
      • negativeSampleWeight
      • degreeAsProperty
    • gds.beta.graphSage.stream procedure now requires modelName configuration parameter.
    • gds.beta.graphSage.write procedure now requires modelName configuration parameter.
    • Removed startLoss and epochLosses from the result columns of gds.beta.graphSage.write.
    • Added the graph create config as a return field to the train procedure, affecting gds.beta.graphSage.train
    • Fixed result column name embeddings to embedding in GraphSAGE, to align with the other embeddings.
    • Removed configuration parameter maxCost from gds.alpha.bfs/dfs.
    • Unlocking the Enterprise Edition of the Graph Data Science library requires a license key. The previous config setting has been removed.
    • Removed degreeDistribution from gds.graph.drop return columns.
    • gds.pageRank now respects the concurrency setting. It will not run if there is insufficient memory for the given concurrency setting.
    • Alpha similarity algorithms no longer accept graph name as a parameter. The algorithm never used the named graph, and now the possibility to specify one is removed.
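
    The embedding-related renames and removals listed above can be applied mechanically to an existing configuration map. The following is a minimal sketch, not part of GDS: the helper name migrate_config is hypothetical, and for simplicity it ignores which procedure the configuration belongs to, although the notes scope each change to specific algorithms.

```python
# Hypothetical migration helper; the mappings come from the breaking changes above.
RENAMED = {
    "embeddingSize": "embeddingDimension",
    "projectedFeatureSize": "projectedFeatureDimension",
    "nodePropertyNames": "featureProperties",
}
# Parameters that were removed outright and should no longer be passed.
REMOVED = {"sparsity", "normalizeL2"}


def migrate_config(old_config):
    """Return (new_config, dropped_keys) for a pre-1.4 configuration map."""
    new_config = {}
    dropped = []
    for key, value in old_config.items():
        if key in REMOVED:
            dropped.append(key)
            continue
        # Use the new name when the key was renamed, otherwise keep it as-is.
        new_config[RENAMED.get(key, key)] = value
    return new_config, sorted(dropped)


old = {"embeddingSize": 64, "sparsity": 3, "normalizeL2": True, "maxIterations": 10}
new, dropped = migrate_config(old)
print(new)      # {'embeddingDimension': 64, 'maxIterations': 10}
print(dropped)  # ['normalizeL2', 'sparsity']
```

    Returning the dropped keys separately makes it easy to log what was discarded instead of silently changing behaviour.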

    New features

    • Promoted GraphSage to the beta tier and added support for inductive models with the train mode
      • This adds procedures
        • gds.beta.graphSage.mutate
        • gds.beta.graphSage.mutate.estimate
        • gds.beta.graphSage.stream
        • gds.beta.graphSage.stream.estimate
        • gds.beta.graphSage.train
        • gds.beta.graphSage.train.estimate
        • gds.beta.graphSage.write
        • gds.beta.graphSage.write.estimate
      • And removes alpha procedures
        • gds.alpha.graphSage.stream
        • gds.alpha.graphSage.write
    • GraphSage supports relationship weights, driven by relationshipWeightProperty
    • GraphSage supports node labels via projectedFeatureDimension
    • Introduced the model catalog to manage trained models, including:
      • gds.beta.model.exists - a procedure to check if a model exists in the catalog
      • gds.beta.model.list - lists all available models
      • gds.beta.model.drop - removes a model from the catalog
    • The Random Projection algorithm has been promoted to the product tier and we have added:
      • gds.fastRP.stats
      • gds.fastRP.mutate
      • gds.fastRP.estimate
      • Added procedures for stats and mutate mode, as well as estimates for all modes.
    • FastRP has been extended to support relationship weights and directions
    • FastRP supports integer configuration for iteration weights.
    • We’ve added support for node property features for FastRP in the beta namespace with FastRPExtended:
      • gds.beta.fastRPExtended.mutate
      • gds.beta.fastRPExtended.stream
      • gds.beta.fastRPExtended.stats
      • gds.beta.fastRPExtended.write
      • gds.beta.fastRPExtended.mutate.estimate
      • gds.beta.fastRPExtended.stream.estimate
      • gds.beta.fastRPExtended.stats.estimate
      • gds.beta.fastRPExtended.write.estimate
    • We’ve added the K-Nearest Neighbors (KNN) algorithm to the beta tier:
      • gds.beta.knn.mutate and gds.beta.knn.mutate.estimate
      • gds.beta.knn.stats and gds.beta.knn.stats.estimate
      • gds.beta.knn.stream and gds.beta.knn.stream.estimate
      • gds.beta.knn.write and gds.beta.knn.write.estimate
    • The in-memory graph now supports list properties, so embedding results can be stored in memory and embeddings can be loaded from nodes for KNN or similarity calculations.
    • Pregel framework
      • Added Pregel annotation processor to generate GDS procedures for custom Pregel algorithms.
      • Pregel now supports long and double array node values.
      • Added support for composite node state to allow complex data types on nodes.
      • Reduced memory consumption.
      • Improved memory estimation.
      • Simplified message iteration in compute methods.
      • Split context into Init- and ComputeContext and simplified API.
      • Removed K1ColoringExample standalone project.
      • Added pregel-bootstrap standalone project.
      • Added pregel-examples module.
    • Licensing: GDS Enterprise edition now requires license keys issued by Neo4j to unlock enterprise features
    • Added a density property to the graph information returned by gds.graph.list.
    • Added a failIfMissing flag to gds.graph.drop
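
    The split of GraphSage into a train step and separate apply steps means a named model links the two calls. The sketch below builds the Cypher text and parameters for that sequence so that any driver could send them. The function name and the example graph, model, and property names are hypothetical, and only the configuration keys mentioned in these notes are used.

```python
def graphsage_workflow(graph_name, model_name, feature_properties, embedding_dimension):
    """Return (cypher, parameters) pairs: train a named model, then stream embeddings."""
    train = (
        "CALL gds.beta.graphSage.train($graphName, $config)",
        {
            "graphName": graph_name,
            "config": {
                "modelName": model_name,
                # New parameter names after the renames in this release.
                "featureProperties": feature_properties,
                "embeddingDimension": embedding_dimension,
            },
        },
    )
    stream = (
        "CALL gds.beta.graphSage.stream($graphName, $config) "
        "YIELD nodeId, embedding RETURN nodeId, embedding",
        # Stream only needs to know which trained model to apply.
        {"graphName": graph_name, "config": {"modelName": model_name}},
    )
    return [train, stream]


# Hypothetical names, purely for illustration.
for cypher, parameters in graphsage_workflow("persons", "demoModel", ["age"], 64):
    print(cypher)
    print(parameters)
```

    Keeping the training parameters out of the stream call mirrors the change described above: they are fixed once the model is trained, and later calls only reference the model by name.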

    Bug fixes

    • Pregel:
      • Fixed a bug in Pregel that could lead to incorrect results when running in parallel.
      • Fixed a cast exception when returning array node properties in generated Pregel procedures.
    • Fixed a bug in a multi-source BFS traversal strategy that could affect the following procedures:
      • gds.alpha.closeness
      • gds.alpha.closeness.harmonic
      • gds.alpha.allShortestPaths
    • Fixed a bug in gds.alpha.shortestPath.deltaStepping where large relationship weights led to incorrect results
    • Weakly connected components:
      • Fixed a bug in WCC where componentCount would be negative when the graph is empty.
      • Fixed a regression where WCC could run more slowly with increased concurrency.
    • Fixed bugs in Louvain:
      • communityCount is no longer negative when the graph is empty.
      • changes to maxIterations are no longer ignored.
    • Fixed a bug in LabelPropagation where communityCount would be negative when the graph is empty.
    • Fixed a bug in KNN where it failed when run on graphs with filtere...

    GDS 1.4 Preview

    16 Oct 20:03
    Pre-release

    Breaking changes

    • Removed sparsity parameter from gds.alpha.randomProjection.*
    • Renamed gds.alpha.randomProjection to gds.fastRP due to productization.
    • Renamed embeddingSize parameter to embeddingDimension for fastRP, GraphSAGE and Node2Vec.
    • Default parameters for gds.fastRP have changed on the following configuration parameters:
      • iterationWeights now has default [0.0, 1.0, 1.0]
      • normalizeL2 has been removed and its effect is always applied
    • Removed alpha procedures for GraphSage (replaced with beta tier, see New Features section)
      • gds.alpha.graphSage.stream
      • gds.alpha.graphSage.write
    • GraphSage no longer calculates embeddings directly. Instead, it has been split into train (to generate a named model) and write, mutate, and stream (to apply the model's predictions to your data).
    • Due to the creation of a train mode for GraphSage, the following configuration parameters were moved to gds.beta.graphSage.train:
      • embeddingSize
      • aggregator
      • activationFunction
      • sampleSizes
      • nodePropertyNames
      • tolerance
      • learningRate
      • epochs
      • maxIterations
      • searchDepth
      • negativeSampleWeight
      • degreeAsProperty
    • gds.beta.graphSage.stream procedure now requires modelName configuration parameter.
    • gds.beta.graphSage.write procedure now requires modelName configuration parameter.
    • Removed startLoss and epochLosses from the result columns of gds.beta.graphSage.write.
    • Added the graph create config as a return field to the train procedure, affecting gds.beta.graphSage.train
    • Fixed result column name embeddings to embedding in GraphSAGE, to align with the other embeddings.
    • Removed configuration parameter maxCost from gds.alpha.bfs/dfs.
    • Unlocking the Enterprise Edition of the Graph Data Science library requires a license key. The previous config setting has been removed.
    • Removed degreeDistribution from gds.graph.drop return columns.
    • gds.pageRank now respects the concurrency setting. It will not run if there is insufficient memory for the given concurrency setting.
    • Alpha similarity algorithms no longer accept graph name as a parameter. The algorithm never used the named graph, and now the possibility to specify one is removed.

    New features

    • Promoted GraphSage to the beta tier and added support for inductive models with the train mode
      • This adds procedures
        • gds.beta.graphSage.mutate
        • gds.beta.graphSage.mutate.estimate
        • gds.beta.graphSage.stream
        • gds.beta.graphSage.stream.estimate
        • gds.beta.graphSage.train
        • gds.beta.graphSage.train.estimate
        • gds.beta.graphSage.write
        • gds.beta.graphSage.write.estimate
      • And removes alpha procedures
        • gds.alpha.graphSage.stream
        • gds.alpha.graphSage.write
    • GraphSage supports relationship weights, driven by relationshipWeightProperty
    • GraphSage supports node labels via projectedFeatureSize
    • Introduced the model catalog to manage trained models, including:
      • gds.beta.model.exists - a procedure to check if a model exists in the catalog
      • gds.beta.model.list - lists all available models
      • gds.beta.model.drop - removes a model from the catalog
    • The Random Projection algorithm has been promoted to the product tier and we have added:
      • gds.fastRP.stats
      • gds.fastRP.mutate
      • gds.fastRP.estimate
      • Added procedures for stats and mutate mode, as well as estimates for all modes.
    • FastRP has been extended to support relationship weights and directions
    • FastRP supports integer configuration for iteration weights.
    • We’ve added support for node property features for FastRP in the beta namespace with FastRPExtended:
      • gds.beta.fastRPExtended.mutate
      • gds.beta.fastRPExtended.stream
      • gds.beta.fastRPExtended.stats
      • gds.beta.fastRPExtended.write
      • gds.beta.fastRPExtended.mutate.estimate
      • gds.beta.fastRPExtended.stream.estimate
      • gds.beta.fastRPExtended.stats.estimate
      • gds.beta.fastRPExtended.write.estimate
    • We’ve added the K-Nearest Neighbors (KNN) algorithm to the beta tier:
      • gds.beta.knn.mutate and gds.beta.knn.mutate.estimate
      • gds.beta.knn.stats and gds.beta.knn.stats.estimate
      • gds.beta.knn.stream and gds.beta.knn.stream.estimate
      • gds.beta.knn.write and gds.beta.knn.write.estimate
    • The in-memory graph now supports list properties, so embedding results can be stored in memory and embeddings can be loaded from nodes for KNN or similarity calculations.
    • Pregel framework
      • Added Pregel annotation processor to generate GDS procedures for custom Pregel algorithms.
      • Pregel now supports long and double array node values.
      • Added support for composite node state to allow complex data types on nodes.
      • Reduced memory consumption.
      • Improved memory estimation.
      • Simplified message iteration in compute methods.
      • Split context into Init- and ComputeContext and simplified API.
      • Removed K1ColoringExample standalone project.
      • Added pregel-bootstrap standalone project.
      • Added pregel-examples module.
    • Licensing: GDS Enterprise edition now requires license keys issued by Neo4j to unlock enterprise features
    • Added a density property to the graph information returned by gds.graph.list.
    • Added a failIfMissing flag to gds.graph.drop
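
    The notes do not state how the density value is computed. As a rough illustration, the sketch below uses the conventional density of a directed graph without self-loops, which is an assumption here and may differ from what GDS reports. The function name and the choice to return 0.0 for graphs with fewer than two nodes are also assumptions.

```python
def directed_density(node_count, relationship_count):
    """Share of possible directed relationships that are present (assumed formula)."""
    if node_count < 2:
        # No relationships are possible without self-loops; 0.0 is a convention chosen here.
        return 0.0
    possible = node_count * (node_count - 1)
    return relationship_count / possible


print(directed_density(4, 6))  # 0.5
```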

    Bug fixes

    • Pregel:
      • Fixed a bug in Pregel that could lead to incorrect results when running in parallel.
      • Fixed a cast exception when returning array node properties in generated Pregel procedures.
    • Fixed a bug in a multi-source BFS traversal strategy that could affect the following procedures:
      • gds.alpha.closeness
      • gds.alpha.closeness.harmonic
      • gds.alpha.allShortestPaths
    • Weakly connected components:
      • Fixed a bug in WCC where componentCount would be negative when the graph is empty.
      • Fixed a regression where WCC could run more slowly with increased concurrency.
    • Fixed bugs in Louvain:
      • communityCount is no longer negative when the graph is empty.
      • changes to maxIterations are no longer ignored.
    • Fixed a bug in LabelPropagation where communityCount would be negative when the graph is empty.
    • Fixed a bug in gds.graph.export where at most one relationship property per relationship type would be exported.
    • Graph loading:
      • Fixed a bug where using node label projections including properties on large graphs and high concurrency could lead to loss of some properties.
      • Fixed a bug in graph creation which could cause an ArrayIndexOutOfBoundsException during node loading.
      • The readConcurrency config parameter can no longer be overwritten by the concurrency param when it is explicitly set in an implicit graph creation config.
    • Fixed a bug in memory estimation of large anonymous fictitious graphs.
    • Fixed a bug in gds.alpha.dfs/bfs where the algorithm did not terminate for graphs containing loops.
    • Fixed result column name embeddings to embedding in GraphSAGE, to align with the other embeddings.
    • Fixed a bug in Node2Vec where many disconnected nodes would cause a StackOverflowError.
    • Fixed a bug in RandomProjection where each iteration weight was multiplied by all previous iteration weights.
    • Similarity algorithms:
      • Fixed a bug where Alpha Similarity algorithms would load a graph even though it was not needed
      • Fixed a bug where similarity algorithms would not remove the placeholder graph if config validation fails on invalid user input.
    • Fixed a bug where community statistic computation could overflow for large community ids.
    • Fixed a bug where DegreeCentrality returned incorrect values when concurrency > 1.
    • Fixed a bug where ClosenessCentrality used a slightly incorrect formula for the Wasserman-Faust algorithm.
    • Fixed a bug that affected gds.triangleCount() and gds.alpha.triangles() where not all triangles would be counted under certain conditions.
    • Parallel edges in a graph no longer lead to incorrect Local Clustering Coefficient and Triangle Count results.
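
    For reference on the closeness fix above, the textbook Wasserman-Faust variant scales plain closeness by the fraction of nodes that are reachable, which keeps scores comparable across disconnected components. The notes do not say which part of the formula was wrong, so the sketch below shows only the standard definition on an unweighted graph, not the GDS implementation. The function names and the adjacency-dict representation are assumptions, and every node is assumed to appear as a key.

```python
from collections import deque


def bfs_distances(adjacency, source):
    """Hop distances from source to every reachable node (source included, at 0)."""
    distances = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


def wasserman_faust_closeness(adjacency, node):
    """(reachable / (N - 1)) * (reachable / sum of distances to reachable nodes)."""
    total_nodes = len(adjacency)
    distances = bfs_distances(adjacency, node)
    reachable = len(distances) - 1  # exclude the node itself
    if total_nodes < 2 or reachable == 0:
        return 0.0
    farness = sum(distances.values())
    return (reachable / (total_nodes - 1)) * (reachable / farness)


# Two components: a path a-b-c and a single edge d-e (undirected).
graph = {"a": ["b"], "b": ["a", "c"], "c": ["b"], "d": ["e"], "e": ["d"]}
print(wasserman_faust_closeness(graph, "b"))  # 0.5
print(wasserman_faust_closeness(graph, "d"))  # 0.25
```

    Without the reachable-fraction factor, both b and d would score 1.0, even though d only reaches one of the other four nodes.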

    Improvements

    • gds.fastRP now accepts integer iterationWeights
    • If graphSage.train is run on a graph without relationships, GDS now fails gracefully with an appropriate error message
    • Added validation that properties used by GraphSage exist on graph
    • Added validation for embeddingSize >= 1
    • Added a failIfExists flag to graph creation to enable a user to specify that if a graph already exists, it should be overwritten without failing.
    • Progress logging:
      • We now log progress in equally spaced percentages. This is 0-100% either in steps of 1, or in ...