- Supervised Learning
  - Linear Regression
  - Classification
    - Logistic Regression
    - Support Vector Machine: handles non-linear classification via kernels
- Unsupervised Learning
  - Lower dimension representation
    - Principal Component Analysis
  - Sparse representation
    - K-Means
    - Gaussian Mixture Models
  - Independent representation
    - Principal Component Analysis
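K-Means sits under the sparse-representation heading because each point ends up described by a single active cluster index, i.e. a one-hot code. A minimal sketch of that idea in plain Python follows; the sample points, the starting centroids, and the helper names `kmeans` and `one_hot` are made up for illustration and do not come from any library.

```python
def kmeans(points, centroids, iterations=10):
    """Lloyd's algorithm on 2-D points, starting from the given centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: index of the nearest centroid for every point.
        labels = [
            min(
                range(len(centroids)),
                key=lambda k: (p[0] - centroids[k][0]) ** 2
                + (p[1] - centroids[k][1]) ** 2,
            )
            for p in points
        ]
        # Update step: move each centroid to the mean of its assigned points.
        for k in range(len(centroids)):
            members = [p for p, label in zip(points, labels) if label == k]
            if members:  # keep the old centroid if a cluster is empty
                centroids[k] = (
                    sum(p[0] for p in members) / len(members),
                    sum(p[1] for p in members) / len(members),
                )
    return centroids, labels


def one_hot(label, k):
    """Sparse code for a point: 1 at its cluster index, 0 elsewhere."""
    return [1 if i == label else 0 for i in range(k)]


# Two well-separated groups of made-up points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (9.0, 9.0), (9.0, 10.0), (10.0, 9.0)]
centroids, labels = kmeans(points, centroids=[(0.0, 0.0), (10.0, 10.0)])
codes = [one_hot(label, 2) for label in labels]
```

Here `codes` holds one one-hot vector per point. A Gaussian Mixture Model would replace the hard assignment with per-cluster probabilities.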
Cost Function
- Regularization
- Maximum Likelihood
- KL divergence
- cross-entropy
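The last three items are closely linked: cross-entropy equals the entropy of the data distribution plus the KL divergence from the model to it, so minimizing cross-entropy over model parameters is the same as minimizing KL divergence, and with an empirical data distribution it matches maximum likelihood. A small sketch that checks the identity numerically; the two distributions `p` and `q` are made-up values for illustration.

```python
import math


def entropy(p):
    """H(p) = -sum p_i log p_i, in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)


def cross_entropy(p, q):
    """H(p, q) = -sum p_i log q_i, in nats."""
    return -sum(pi * math.log(qi) for pi, qi in zip(p, q) if pi > 0)


def kl_divergence(p, q):
    """KL(p || q) = sum p_i log(p_i / q_i), in nats."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


p = [0.5, 0.25, 0.25]  # data distribution (made up)
q = [0.4, 0.4, 0.2]    # model distribution (made up)

# H(p, q) - (H(p) + KL(p || q)) should vanish up to rounding error.
gap = cross_entropy(p, q) - (entropy(p) + kl_divergence(p, q))
```

Since `entropy(p)` does not depend on the model, only the KL term changes when `q` is tuned.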
Graph Processing
- frameworks
  - PageRank: Google's ranking algorithm over a directed graph of links
  - Pregel
  - Giraph
  - GraphLab
  - GraphX
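As a rough illustration of what PageRank computes on a directed graph, here is a minimal power-iteration sketch in plain Python. The damping factor of 0.85, the iteration count, the handling of dangling nodes, and the four-node example graph are assumptions chosen for illustration, not details taken from these notes or from any of the frameworks listed.

```python
def pagerank(out_links, damping=0.85, iterations=50):
    """Power iteration over a dict mapping each node to its outgoing links."""
    nodes = list(out_links)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Every node starts the round with the "random jump" share.
        new_rank = {v: (1.0 - damping) / n for v in nodes}
        for v, targets in out_links.items():
            if targets:
                # Split this node's damped rank evenly across its out-links.
                share = damping * rank[v] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling node: spread its damped rank over all nodes.
                for t in nodes:
                    new_rank[t] += damping * rank[v] / n
        rank = new_rank
    return rank


# Hypothetical link graph: an edge a -> b means "a links to b".
graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
ranks = pagerank(graph)
```

Node `c` collects links from every other node and ends up with the highest rank, while `d`, which nothing links to, keeps only the random-jump share.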
GraphX
GraphX abstracts a graph as an RDD of vertices and an RDD of edges.
- Connected Components: org.apache.spark.graphx.lib.ConnectedComponents
- Triangle Counting: org.apache.spark.graphx.lib.TriangleCount
- Shortest Paths: org.apache.spark.graphx.lib.ShortestPaths
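GraphX is a Scala API, so the following is not GraphX code. It is a plain-Python sketch of what the three routines compute, on a graph stored the same way: one collection of vertices and one of edges. The sample graph and all function names are made up for illustration, and edges are treated as undirected for simplicity, which may differ from how the library handles edge direction.

```python
from collections import deque

# Two collections, mirroring the vertex/edge layout: (id, attribute) and (src, dst).
vertices = [(1, "a"), (2, "b"), (3, "c"), (4, "d"), (5, "e")]
edges = [(1, 2), (2, 3), (3, 1), (4, 5)]

# Undirected adjacency sets built from the edge list.
neighbors = {vid: set() for vid, _ in vertices}
for src, dst in edges:
    neighbors[src].add(dst)
    neighbors[dst].add(src)


def hop_counts(start):
    """Breadth-first search: number of edges from `start` to each reachable vertex."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in neighbors[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist


def connected_components():
    """Label every vertex with the smallest vertex id in its component."""
    label = {}
    for vid in sorted(neighbors):
        if vid not in label:
            for reached in hop_counts(vid):
                label[reached] = vid
    return label


def triangle_counts():
    """For each vertex, count the triangles that pass through it."""
    return {
        v: sum(1 for a in nbrs for b in nbrs if a < b and b in neighbors[a])
        for v, nbrs in neighbors.items()
    }


components = connected_components()
triangles = triangle_counts()
distances = hop_counts(1)
```

On this graph, vertices 1 to 3 form one component containing a single triangle, and vertices 4 and 5 form a second component with none.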