[Community] Datawaves Workshop (How to Explore On-Chain Data with a Notebook)

Datawaves is a powerful tool for blockchain research. Datawaves has structured Ethereum data so that you can use it directly in your browser through a notebook interface for data wrangling, data visualization, exploratory data analysis, and machine learning.

Prerequisite Skills

  • Python
  • Basics of Pandas
  • Basics of Jupyter Notebook
  • Basics of data science packages (sklearn/Altair/… the walk-through covers)

Workshop Topic and Structure
Part 1: Datawaves
Intro: motivations for Datawaves

  1. What is Datawaves?
  2. Basic functions: Data, Notebook, Share, and Fork
  3. Differences from Colab and Dune

Intro: Spark

  1. What is Spark and Spark SQL?
  2. What is a Spark DataFrame?
  3. Differences between a Spark DataFrame and a Pandas DataFrame
  4. How to convert a Spark DataFrame to a Pandas DataFrame?
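
As a rough illustration of item 4: in PySpark the conversion is normally done with `toPandas()`, which collects every row into the driver's memory, so it is common to call `limit()` first. The helper name `spark_to_pandas` and the `max_rows` default are hypothetical, chosen only for this sketch, and are not part of Datawaves.

```python
def spark_to_pandas(spark_df, max_rows=10_000):
    """Collect at most `max_rows` rows of a Spark DataFrame into pandas.

    `spark_df` is assumed to be a pyspark.sql.DataFrame. toPandas() pulls
    all remaining rows into driver memory, so limit() is applied first
    to keep the result small.
    """
    return spark_df.limit(max_rows).toPandas()
```

In a notebook where a Spark DataFrame `sdf` already exists, `pdf = spark_to_pandas(sdf, max_rows=1000)` would give a Pandas DataFrame of up to 1,000 rows.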

Part 2: Data visualization with Datawaves

  1. Introduction to the problem (TBD)
  2. Notebook interface: running cells, restarting and interrupting the kernel
  • Issues that may occur when using the notebook
  3. Data tables and querying with SQL magic
  • How to list databases and tables in SQL
  • How to create a temporary view from a DataFrame
  • How to print the schema, columns, and rows of a DataFrame
  4. Data preparation (TBD)
  • Intro to the nft.trades table
  5. Writing Markdown
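
The querying steps in Part 2 (listing databases and tables, creating a temporary view, printing the schema, columns, and rows) might look roughly like this in PySpark. This is a sketch under assumptions: `spark` is taken to be an active `SparkSession`, `df` a Spark DataFrame, and the function name `explore` and view name `trades_view` are made up for illustration. The notebook's SQL magic would presumably run the same SQL statements directly in a cell.

```python
def explore(spark, df, view_name="trades_view"):
    """Run the inspection steps from the outline on one DataFrame.

    `spark` is assumed to be an active pyspark.sql.SparkSession and `df`
    a pyspark.sql.DataFrame; `view_name` is a hypothetical example name.
    """
    # List databases, then the tables in the current database
    spark.sql("SHOW DATABASES").show()
    spark.sql("SHOW TABLES").show()

    # Register the DataFrame as a temporary view so SQL can query it
    df.createOrReplaceTempView(view_name)

    # Print the schema, the column names, and the first few rows
    df.printSchema()
    print(df.columns)
    df.show(5)

    # The view is now queryable by name; return a row-count query on it
    return spark.sql(f"SELECT COUNT(*) AS n FROM {view_name}")
```

The returned DataFrame is lazy, so calling `.show()` on it is what actually runs the count.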

Part 3: Diving into Linear Regression
Linear Regression (TBD)

  1. Introduction to Linear Regression
  2. Intro to the problem
  3. Data preparation
  4. Code walkthrough
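
Since the Part 3 content is still TBD, here is only a minimal sketch of what a linear regression fit computes. It uses the closed-form least-squares solution in plain Python rather than sklearn (which the workshop is expected to use), and the data points are synthetic, made up for illustration. The function name `fit_line` is hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Synthetic points lying exactly on y = 2x + 1 (not real on-chain data)
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]
slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

With real data the points will not sit exactly on a line, and the fit then minimizes the sum of squared vertical distances to it.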

We plan to hold this workshop in early May.

Related Link