Project author: graphistry

Project description:
PyGraphistry is a Python library to quickly load, shape, embed, and explore big graphs with the GPU-accelerated Graphistry visual graph analyzer
Primary language: Python
Repository: git://github.com/graphistry/pygraphistry.git
Created: 2015-06-02T20:28:42Z
Project community: https://github.com/graphistry/pygraphistry

License: BSD 3-Clause "New" or "Revised" License

PyGraphistry: Leverage the power of graphs & GPUs to visualize, analyze, and scale your data

Demo: Interactive visualization of 80,000+ Facebook friendships (source data)

PyGraphistry is an open source Python library that lets data scientists and developers apply graph visualization, analytics, and AI to their data, with optional native GPU acceleration:

  • Prototype locally and deploy remotely: Prototype from notebooks like Jupyter and Databricks using local CPUs & GPUs, and then power production dashboards & pipelines with Graphistry Hub and your own self-hosted servers.

  • Query graphs with GFQL: Use GFQL, the first dataframe-native graph query language, to ask relationship questions that are difficult for tabular tools, without requiring a database.

  • graphistry[ai]: Call streamlined graph ML & AI methods to benefit from clustering, UMAP embeddings, graph neural networks, automatic feature engineering, and more.

  • Visualize & explore large graphs: In just a few minutes, create stunning interactive visualizations with millions of edges and many point-and-click built-ins like drilldowns, timebars, and filtering. When ready, customize with Python, JavaScript, and REST APIs.

  • Columnar & GPU acceleration: CPU-mode ingestion and wrangling are fast due to native use of Apache Arrow and columnar analytics, and the optional RAPIDS-based GPU mode can deliver 100X+ speedups.

From global top-10 banks, manufacturers, news agencies, and government agencies to startups, game companies, scientists, biotechs, and NGOs, many teams are tackling their graph workloads with Graphistry.

The notebook demo gallery shares many more live visualizations, demos, and integration examples, including:

  • Twitter Botnet

  • Edit Wars on Wikipedia (data)

  • 100,000 Bitcoin Transactions

  • Port Scan Attack

  • Protein Interactions (data)

  • Programming Languages (data)

Install

Common configurations:

  • Minimal core

    Includes: The GFQL dataframe-native graph query language, built-in layouts, Graphistry visualization server client

    pip install graphistry

    Does not include graphistry[ai], plugins

  • No dependencies and user-level

    pip install --no-deps --user graphistry

  • GPU acceleration - Optional

    Local GPU: Install RAPIDS and/or deploy a GPU-ready Graphistry server

    Remote GPU: Use the remote endpoints.

For further options, see the installation guides
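
To confirm which of the configurations above ended up in the active environment, a small check like the following can help. It is a minimal sketch using only the Python standard library; the helper name `installed_version` is hypothetical and not part of PyGraphistry.

```python
from importlib import metadata


def installed_version(package: str = "graphistry"):
    """Return the installed version string of `package`, or None if it is absent.

    Hypothetical helper for illustration; it only reads package metadata
    and does not import the package itself.
    """
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print("graphistry is not installed; try: pip install graphistry")
else:
    print(f"graphistry {version} is installed")
```

Because the check reads metadata only, it also works for installs made with `--no-deps`, where importing optional extras might fail.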

Visualization quickstart

Quickly go from raw data to a styled and interactive Graphistry graph visualization:

    import graphistry
    import pandas as pd

    # Raw data as Pandas CPU dataframes, cuDF GPU dataframes, Spark, ...
    df = pd.DataFrame({
        'src': ['Alice', 'Bob', 'Carol'],
        'dst': ['Bob', 'Carol', 'Alice'],
        'friendship': [0.3, 0.95, 0.8]
    })

    # Bind
    g1 = graphistry.edges(df, 'src', 'dst')

    # Override styling defaults
    g1_styled = g1.encode_edge_color('friendship', ['blue', 'red'], as_continuous=True)

    # Connect: Free GPU accounts and self-hosting @ graphistry.com/get-started
    graphistry.register(api=3, username='your_username', password='your_password')

    # Upload for GPU server visualization session
    g1_styled.plot()

Explore 10 Minutes to Graphistry Visualization for more visualization examples and options
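
To build intuition for what a continuous color encoding such as `encode_edge_color('friendship', ['blue', 'red'], as_continuous=True)` expresses, the sketch below maps each `friendship` value onto a blue-to-red gradient in plain Python. This is only a conceptual illustration: the helper `lerp_color` and the min-max scaling are assumptions made here, and Graphistry's own scaling and rendering may differ.

```python
def lerp_color(value, lo, hi, start=(0, 0, 255), end=(255, 0, 0)):
    """Linearly interpolate an RGB color for `value` within [lo, hi].

    Hypothetical helper: `start` is blue, `end` is red, and values are
    min-max normalized before interpolation (an assumed scaling).
    """
    t = 0.0 if hi == lo else (value - lo) / (hi - lo)
    return tuple(round(s + t * (e - s)) for s, e in zip(start, end))


# Same edge weights as the quickstart dataframe
friendship = [0.3, 0.95, 0.8]
lo, hi = min(friendship), max(friendship)
colors = [lerp_color(v, lo, hi) for v in friendship]

for value, color in zip(friendship, colors):
    print(f"friendship={value} -> rgb{color}")
```

The weakest friendship lands on pure blue, the strongest on pure red, and values in between fall proportionally along the gradient.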

PyGraphistry[AI] & GFQL quickstart - CPU & GPU

CPU graph pipeline combining graph ML, AI, mining, and visualization:

    from graphistry import n, e, e_forward, e_reverse

    # Graph analytics
    g2 = g1.compute_igraph('pagerank')
    assert 'pagerank' in g2._nodes.columns

    # Graph ML/AI
    g3 = g2.umap()
    assert ('x' in g3._nodes.columns) and ('y' in g3._nodes.columns)

    # Graph querying with GFQL
    g4 = g3.chain([
        n(query='pagerank > 0.1'), e_forward(), n(query='pagerank > 0.1')
    ])
    assert (g4._nodes.pagerank > 0.1).all()

    # Upload for GPU server visualization session
    g4.plot()
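
To make the query pattern in the pipeline above concrete, here is a dependency-free sketch of the same idea: score nodes with PageRank, then keep one-hop forward edges whose source and destination both exceed a threshold. It is not how igraph or GFQL are implemented; the functions `pagerank` and `one_hop_forward`, the damping factor of 0.85, and the iteration count are all assumptions chosen for illustration.

```python
def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank over a list of (src, dst) pairs (illustrative only)."""
    nodes = sorted({v for edge in edges for v in edge})
    out_degree = {v: 0 for v in nodes}
    for src, _ in edges:
        out_degree[src] += 1
    rank = {v: 1.0 / len(nodes) for v in nodes}
    for _ in range(iterations):
        # Every node receives the teleport share first
        nxt = {v: (1.0 - damping) / len(nodes) for v in nodes}
        # Each edge passes along an equal part of its source's rank
        for src, dst in edges:
            nxt[dst] += damping * rank[src] / out_degree[src]
        # Rank held by nodes without out-edges is spread evenly
        dangling = sum(rank[v] for v in nodes if out_degree[v] == 0)
        for v in nodes:
            nxt[v] += damping * dangling / len(nodes)
        rank = nxt
    return rank


def one_hop_forward(edges, rank, threshold):
    """Keep edges whose source and destination both score above `threshold`."""
    return [(s, d) for s, d in edges if rank[s] > threshold and rank[d] > threshold]


# Same edges as the quickstart dataframe
edges = [("Alice", "Bob"), ("Bob", "Carol"), ("Carol", "Alice")]
rank = pagerank(edges)
matched = one_hop_forward(edges, rank, threshold=0.1)
print(rank)
print(matched)
```

On this three-node cycle every node gets the same score, so all three edges satisfy the `pagerank > 0.1` pattern; raising the threshold above one third would leave no matches.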

The automatic GPU modes require almost no code changes:

    import cudf
    from graphistry import n, e, e_forward, e_reverse

    # Modified -- Rebind data as a GPU dataframe and swap in a GPU plugin call
    g1_gpu = g1.edges(cudf.from_pandas(df))
    g2 = g1_gpu.compute_cugraph('pagerank')

    # Unmodified -- Automatic GPU mode for all ML, AI, GFQL queries, & visualization APIs
    g3 = g2.umap()
    g4 = g3.chain([
        n(query='pagerank > 0.1'), e_forward(), n(query='pagerank > 0.1')
    ])
    g4.plot()
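
Since the GPU path depends on RAPIDS being installed, a script meant to run on both CPU-only and GPU machines may want to pick the dataframe engine at runtime. The sketch below shows one way to do that with the standard library. `pick_dataframe_engine` is a hypothetical helper, not a PyGraphistry API, and an importable `cudf` package does not by itself guarantee a usable GPU.

```python
import importlib.util


def pick_dataframe_engine() -> str:
    """Return 'cudf' when the cuDF package can be located, else 'pandas'.

    Hypothetical helper: find_spec only locates the package without
    importing it, so this does not verify that a GPU is actually present.
    """
    return "cudf" if importlib.util.find_spec("cudf") is not None else "pandas"


engine = pick_dataframe_engine()
print(f"Using {engine} dataframes")

# With the quickstart's `g1` and `df` in scope, the rebinding step could be guarded:
# if engine == "cudf":
#     import cudf
#     g1 = g1.edges(cudf.from_pandas(df))
```

Keeping the check separate from the pipeline means the rest of the code stays identical across both modes, in line with the "almost no code changes" goal above.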

Explore 10 Minutes to PyGraphistry for a wider variety of graph processing.

PyGraphistry documentation

Graphistry ecosystem

Community and support

Contribute

See CONTRIBUTING and DEVELOP for participating in PyGraphistry development, or reach out to our team