BinaryClassificationMetrics

class pyspark.mllib.evaluation.BinaryClassificationMetrics(scoreAndLabels: pyspark.rdd.RDD[Tuple[float, float]])
Evaluator for binary classification.

New in version 1.4.0.

Parameters
- scoreAndLabels (pyspark.RDD): an RDD of (score, label) tuples, each optionally followed by a weight.

Examples

These doctests assume an active SparkContext bound to sc.

>>> from pyspark.mllib.evaluation import BinaryClassificationMetrics
>>> scoreAndLabels = sc.parallelize([
...     (0.1, 0.0), (0.1, 1.0), (0.4, 0.0), (0.6, 0.0), (0.6, 1.0), (0.6, 1.0), (0.8, 1.0)], 2)
>>> metrics = BinaryClassificationMetrics(scoreAndLabels)
>>> metrics.areaUnderROC
0.70...
>>> metrics.areaUnderPR
0.83...
>>> metrics.unpersist()
>>> scoreAndLabelsWithOptWeight = sc.parallelize([
...     (0.1, 0.0, 1.0), (0.1, 1.0, 0.4), (0.4, 0.0, 0.2), (0.6, 0.0, 0.6), (0.6, 1.0, 0.9),
...     (0.6, 1.0, 0.5), (0.8, 1.0, 0.7)], 2)
>>> metrics = BinaryClassificationMetrics(scoreAndLabelsWithOptWeight)
>>> metrics.areaUnderROC
0.79...
>>> metrics.areaUnderPR
0.88...

Methods
- call(name, *a): Call method of java_model
- unpersist(): Unpersists intermediate RDDs used in the computation.

Attributes
- areaUnderPR: Computes the area under the precision-recall curve.
- areaUnderROC: Computes the area under the receiver operating characteristic (ROC) curve.

Methods Documentation

call(name: str, *a: Any) → Any
- Call method of java_model.

unpersist() → None
- Unpersists intermediate RDDs used in the computation.
- New in version 1.4.0.

Attributes Documentation

areaUnderPR
- Computes the area under the precision-recall curve.
- New in version 1.4.0.
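
To make this quantity concrete, here is a minimal plain-Python sketch of one way to obtain an area under the precision-recall curve. It is not Spark's implementation and does not use Spark at all. It sweeps the distinct scores from high to low as thresholds, accumulates the (optionally weighted) true and false positives, and integrates precision over recall with the trapezoidal rule. The helper name `area_under_pr` is hypothetical, and anchoring the curve at recall 0 with the precision of the first threshold is an assumption made for illustration. All weights are assumed to be positive.

```python
from collections import defaultdict
from typing import Iterable, Sequence


def area_under_pr(rows: Iterable[Sequence[float]]) -> float:
    """Trapezoidal area under the precision-recall curve (illustrative only).

    Each row is (score, label) or (score, label, weight); weight defaults to 1.0.
    """
    pos = defaultdict(float)  # score -> total weight of positive rows
    neg = defaultdict(float)  # score -> total weight of negative rows
    for row in rows:
        score, label = row[0], row[1]
        weight = row[2] if len(row) > 2 else 1.0
        (pos if label > 0.5 else neg)[score] += weight

    total_pos = sum(pos.values())
    if total_pos == 0:
        raise ValueError("need at least one positive row")

    tp = fp = 0.0
    points = []  # (recall, precision), one point per distinct threshold
    for score in sorted(set(pos) | set(neg), reverse=True):
        tp += pos.get(score, 0.0)
        fp += neg.get(score, 0.0)
        points.append((tp / total_pos, tp / (tp + fp)))

    # Assumption: start the curve at recall 0 with the first point's precision.
    points.insert(0, (0.0, points[0][1]))
    return sum((r1 - r0) * (p0 + p1) / 2.0
               for (r0, p0), (r1, p1) in zip(points, points[1:]))


# Same data as the Examples section, without and with weights.
unweighted = [(0.1, 0.0), (0.1, 1.0), (0.4, 0.0), (0.6, 0.0),
              (0.6, 1.0), (0.6, 1.0), (0.8, 1.0)]
weighted = [(0.1, 0.0, 1.0), (0.1, 1.0, 0.4), (0.4, 0.0, 0.2), (0.6, 0.0, 0.6),
            (0.6, 1.0, 0.9), (0.6, 1.0, 0.5), (0.8, 1.0, 0.7)]
print(round(area_under_pr(unweighted), 2))  # 0.83
print(round(area_under_pr(weighted), 2))    # 0.88
```

On this data the sketch agrees with the truncated doctest outputs `0.83...` and `0.88...`, but that agreement is only checked for these two inputs.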

areaUnderROC
- Computes the area under the receiver operating characteristic (ROC) curve.
- New in version 1.4.0.
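
As an illustration of what this attribute measures, the following plain-Python sketch computes an area under the ROC curve with the trapezoidal rule. It is not Spark's implementation and needs no SparkContext. Distinct scores are used as thresholds from high to low, and each threshold contributes one (false positive rate, true positive rate) point, starting from (0, 0). The helper name `area_under_roc` is hypothetical, and weights are assumed to be positive.

```python
from collections import defaultdict
from typing import Iterable, Sequence


def area_under_roc(rows: Iterable[Sequence[float]]) -> float:
    """Trapezoidal area under the ROC curve (illustrative only).

    Each row is (score, label) or (score, label, weight); weight defaults to 1.0.
    """
    pos = defaultdict(float)  # score -> total weight of positive rows
    neg = defaultdict(float)  # score -> total weight of negative rows
    for row in rows:
        score, label = row[0], row[1]
        weight = row[2] if len(row) > 2 else 1.0
        (pos if label > 0.5 else neg)[score] += weight

    total_pos, total_neg = sum(pos.values()), sum(neg.values())
    if total_pos == 0 or total_neg == 0:
        raise ValueError("need at least one positive and one negative row")

    tp = fp = 0.0
    points = [(0.0, 0.0)]  # (false positive rate, true positive rate)
    for score in sorted(set(pos) | set(neg), reverse=True):
        tp += pos.get(score, 0.0)
        fp += neg.get(score, 0.0)
        points.append((fp / total_neg, tp / total_pos))

    # The last threshold admits every row, so the curve ends at (1, 1).
    return sum((x1 - x0) * (y0 + y1) / 2.0
               for (x0, y0), (x1, y1) in zip(points, points[1:]))


# Same data as the Examples section, without and with weights.
unweighted = [(0.1, 0.0), (0.1, 1.0), (0.4, 0.0), (0.6, 0.0),
              (0.6, 1.0), (0.6, 1.0), (0.8, 1.0)]
weighted = [(0.1, 0.0, 1.0), (0.1, 1.0, 0.4), (0.4, 0.0, 0.2), (0.6, 0.0, 0.6),
            (0.6, 1.0, 0.9), (0.6, 1.0, 0.5), (0.8, 1.0, 0.7)]
print(round(area_under_roc(unweighted), 4))  # 0.7083
print(round(area_under_roc(weighted), 4))    # 0.7911
```

Both values are consistent with the truncated doctest outputs `0.70...` and `0.79...` in the Examples section. Grouping tied scores into a single threshold is what makes a tie between a positive and a negative row count as half a correctly ordered pair.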