sift:principal_component_analysis:mahalanobis_distance_and_spe_dialog

Differences

This shows you the differences between two versions of the page.

sift:principal_component_analysis:mahalanobis_distance_and_spe_dialog [2024/07/03 17:35] – created sgranger
sift:principal_component_analysis:mahalanobis_distance_and_spe_dialog [2024/11/15 20:22] (current) – [Dialog] wikisysop
Line 1: Line 1:
-====== Mahalanobis_Distance_and_SPE_Dialog ======
+====== Mahalanobis Distance and SPE Dialog ======
  
-[[Sift:Principal_Component_Analysis:Outlier_Detection_for_PCA#Mahalanobis_Distance_Test|Mahalanobis Distance]] and [[Sift:Principal_Component_Analysis:Outlier_Detection_for_PCA#Squared_Prediction_Error_(SPE)|SPE]] are common measures used to determine outliers in a data sample. The Mahalanobis distance can be conceptualized as the distance from a point to a centroid of a data set, taking into account correlations in the data set. The Mahalanobis distance method can be used on PCA results. This is done by measuring the distance of each point to the centroid in the transformed PCA space. Alternatively, the SPE can be understood as the distance from the original data, to the PCA reduced/transformed data points (model prediction vs the true model measurement), i.e. the distance from the original point to it's projection into the PCA hyperplane.
+[[Sift:Principal_Component_Analysis:Outlier_Detection_for_PCA#Mahalanobis_Distance_Test|Mahalanobis Distance]] and [[Sift:Principal_Component_Analysis:Outlier_Detection_for_PCA#Squared_Prediction_Error_(SPE)|SPE]] are common measures used to determine outliers in a data sample.
 + 
 +  * The Mahalanobis distance can be conceptualized as the distance from a point to a centroid of a data set, taking into account correlations in the data set. The Mahalanobis distance method can be used on PCA results. This is done by measuring the distance of each point to the centroid in the transformed PCA space.  
 +  * The SPE can be understood as the distance from the original data to the PCA reduced/transformed data points (model prediction vs the true measurement), i.e. the distance from the original point to its projection onto the PCA hyperplane.
  
 As covered in our documentation about [[Sift:Principal_Component_Analysis:Outlier_Detection_for_PCA|Outlier Detection Methods]], Mahalanobis Distance and SPE complement each other quite well, and as such we decided to conjoin them into a single dialog, allowing for easier use of both methods.
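
The two measures described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Sift's implementation: the function names `pca_2d` and `mahalanobis_and_spe` are hypothetical, the data is restricted to two dimensions so the PCA can be done in closed form with the standard library, and one principal component is retained by default. The Mahalanobis distance is measured in the retained PCA space, and the SPE is the squared length of whatever the retained components fail to reconstruct.

```python
import math


def pca_2d(points):
    """Closed-form PCA of 2-D points (illustrative helper, not a Sift API).

    Returns (centroid, eigenvalues, eigenvectors), largest variance first.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    # Sample covariance matrix [[a, b], [b, c]].
    a = sum((x - mean_x) ** 2 for x, _ in points) / (n - 1)
    c = sum((y - mean_y) ** 2 for _, y in points) / (n - 1)
    b = sum((x - mean_x) * (y - mean_y) for x, y in points) / (n - 1)
    # Eigenvalues of a symmetric 2x2 matrix.
    half_trace = (a + c) / 2
    root = math.hypot((a - c) / 2, b)
    eigvals = (half_trace + root, half_trace - root)
    # Eigenvector for the largest eigenvalue; the second one is orthogonal.
    if b != 0:
        v1 = (b, eigvals[0] - a)
    elif a >= c:
        v1 = (1.0, 0.0)
    else:
        v1 = (0.0, 1.0)
    norm = math.hypot(v1[0], v1[1])
    v1 = (v1[0] / norm, v1[1] / norm)
    v2 = (-v1[1], v1[0])
    return (mean_x, mean_y), eigvals, (v1, v2)


def mahalanobis_and_spe(points, n_pcs=1):
    """Return a (mahalanobis_distance, spe) pair for every point.

    Assumes the retained components have non-zero variance.
    """
    centroid, eigvals, eigvecs = pca_2d(points)
    results = []
    for x, y in points:
        dx, dy = x - centroid[0], y - centroid[1]
        # Scores: coordinates of the centred point in PCA space.
        scores = [dx * vx + dy * vy for vx, vy in eigvecs]
        # Mahalanobis distance: each retained score scaled by its variance.
        md_sq = sum(scores[k] ** 2 / eigvals[k] for k in range(n_pcs))
        # SPE: squared residual, i.e. the part on the discarded components.
        spe = sum(scores[k] ** 2 for k in range(n_pcs, 2))
        results.append((math.sqrt(md_sq), spe))
    return results
```

A point far along the main trend of the data gets a large Mahalanobis distance but a small SPE, while a point that sits off the trend gets a large SPE, which is one way to see why the two tests complement each other.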
Line 7: Line 10:
 The Mahalanobis Distance and SPE are found on the toolbar and under 'Outlier Detecting Using PCA' in the Analysis menu.
  
-{{MD_button.png}}
+{{:MD_button.png}}
  
 ==== Dialog ====
  
-{{MD_dlg.png}}
+{{ :MD_dlg.png}}
  
   * **Grouping to Search:** The kind of grouping used to determine the centroid: Combined Groups, Groups, or Workspaces
-  * **Auto-exclude results:** If checked and outliers found will automatically be removed
+  * **Auto-exclude results:** If checked, any outliers found will automatically be removed
   * **Number of Passes:** How many times the test should be run. Removing an outlier may alter the centroid, exposing more outliers
   * **Find All Outliers:** If checked, the Number of Passes parameter will be ignored, and the test will be run until no outliers are found
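
The interaction between Number of Passes, Find All Outliers, and Auto-exclude results can be sketched as a loop that recomputes the centroid after each removal. This is a simplified illustration, not Sift's code: the function name `detect_outliers` and the fixed `threshold` cutoff are hypothetical stand-ins, and it uses one-dimensional data, where the Mahalanobis distance reduces to the distance from the mean divided by the standard deviation.

```python
import statistics


def detect_outliers(values, threshold=2.0, num_passes=1, find_all=False):
    """Return the original indices of values flagged as outliers.

    Each pass recomputes the centroid and spread from the points that are
    still included, so removing one outlier can expose another.
    """
    remaining = list(enumerate(values))  # (original index, value) pairs
    excluded = []
    passes_done = 0
    # find_all ignores num_passes and runs until a pass finds nothing.
    while find_all or passes_done < num_passes:
        if len(remaining) < 3:
            break  # too few points left for a meaningful test
        data = [value for _, value in remaining]
        centroid = statistics.mean(data)
        spread = statistics.stdev(data)
        if spread == 0:
            break  # all remaining points are identical
        flagged = {
            index
            for index, value in remaining
            if abs(value - centroid) / spread > threshold
        }
        if not flagged:
            break  # no outliers found on this pass
        excluded.extend(sorted(flagged))
        # Equivalent of auto-excluding the results before the next pass.
        remaining = [(i, v) for i, v in remaining if i not in flagged]
        passes_done += 1
    return excluded


values = [10, 11, 9, 10, 12, 10, 11, 9, 10, 20, 100]
print(detect_outliers(values, num_passes=1))  # only the value 100 is flagged
print(detect_outliers(values, find_all=True))  # 100 first, then 20
```

In this data set the value 100 inflates the spread so much that 20 looks ordinary on the first pass; once 100 is excluded, the second pass flags 20 as well.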
Line 24: Line 27:
 ==== SPE ====
  
-Since Squared Prediction Error compares a single predictive point to its original value, many of the parameters in the dialog do not apply, the only parameters of note for SPE are:
+Since Squared Prediction Error compares a single predictive point to its original value, some of the parameters in the dialog do not apply. These are:
 
-  * **Auto-exclude results:** If checked and outliers found will automatically be removed
+  * **Grouping to Search:** SPE is not grouped
-  * **Number of PCs:** How many principal components should be considered for the test
+  * **Number of Passes/Find All Outliers:** Removing another outlier does not affect whether a point's SPE marks it as an outlier.
-  * **Outlier alpha value:** The threshold used to determine an outlier
+
  
 ==== Results ====
sift/principal_component_analysis/mahalanobis_distance_and_spe_dialog.1720028154.txt.gz · Last modified: 2024/07/03 17:35 by sgranger