sift:tutorials:perform_principal_component_analysis

Differences

This shows you the differences between two versions of the page.

sift:tutorials:perform_principal_component_analysis [2024/07/26 15:27] sgranger
sift:tutorials:perform_principal_component_analysis [2024/11/28 19:15] (current) – [Perform Principal Component Analysis] wikisysop
Line 3: Line 3:
 This tutorial will show you how to use Sift in order to perform [[Sift:Principal_Component_Analysis:Principal_Component_Analysis|Principal Component Analysis]] (PCA) using data from a [[Visual3D:Documentation:Definitions:CMO_Library_|CMO library]]. For a full treatment of waveform-based PCA to find differences in waveform data, see the explanation presented in [[https://us.humankinetics.com/products/research-methods-in-biomechanics-2nd-edition|the Research Methods in Biomechanics textbook]].
  
-For this tutorial, we will be comparing the the knee flexion angles between participants with osteoarthritis and the normal control group. Our problem is to provide an explanation for differences in knee flexion angles between osteoarthritic walking versus normal walking. We can accomplish this by defining two groups that meet these signal definitions, performing PCA, and interpreting the results.
+For this tutorial, we will be comparing the knee flexion angles between participants with osteoarthritis and the normal control group. Our problem is to provide an explanation for differences in knee flexion angles between osteoarthritic walking versus normal walking. We can accomplish this by defining two groups that meet these signal definitions, performing PCA, and interpreting the results.
  
+If you prefer, a video tutorial is available outlining the same process. It is available at this link: [[https://youtu.be/6lMsQpSx9BI?feature=shared|Sift Tutorial Video 3: Performing Principal Component Analysis (PCA)]]
 ==== Data ====
  
Line 13: Line 14:
 As with previous tutorials, we begin by loading the CMZ library and defining the queries relevant to our question.
  
-1. Click {{:sift_library_load.png?20}} **Load Library** in the [[Sift:Application:Toolbar|toolbar]] to open the Load Library dialog.
-
-2. Click {{:sift_browser.png?20}} **Browse** and select the folder where the data is stored.
-
-3. Click {{:sift_apply.png?20}} **Load** button to import the data.
+  - Click {{:sift_library_load.png?20}} **Load Library** in the [[Sift:Application:Toolbar|toolbar]] to open the Load Library dialog.
+  - Click {{:sift_browser.png?20}} **Browse** and select the folder where the data is stored.
+  - Click {{:sift_apply.png?20}} **Load** button to import the data.
  
 ==== Define queries and calculate groups ====
Line 23: Line 22:
 For this tutorial we will manually create two groups based on tags, one for subjects with osteoarthritis and one for normal control subjects. We begin by defining a query for subjects with the OA tag (indicating osteoarthritis).
  
-1. Click on the {{:sift_query_builder.png?20}} **Query Builder** icon on the [[Sift:Application:Toolbar|toolbar]], or on the left panel of the [[Sift:Application:Explore_Page|Explore Page]], to open the Query Builder dialog.
-
-2. {{:sift_action_add.png?20}} Add a query, name it OA, and click **Save**.
-
-3. {{:sift_action_add.png?20}} Add a condition and name it OA.
-
-3.1. **Signals**: Set TYPE - DERIVED, FOLDER - PCA, NAME - RKNEE_ANGLE, COMPONENT - X. This is the only signal in the data set.
-
-3.2. **Events**: There are no events in this data set, so this tab can be skipped.
-
-3.3. **Refinements**: Check the **Refine using tag** checkbox and select the OA tag.
-
-3.4. Click **Save**
+  - Click on the {{:sift_query_builder.png?20}} **Query Builder** icon on the [[Sift:Application:Toolbar|toolbar]], or on the left panel of the [[Sift:Application:Explore_Page|Explore Page]], to open the Query Builder dialog.
+  - {{:sift_action_add.png?20}} Add a query, name it OA, and click **Save**.
+  - {{:sift_action_add.png?20}} Add a condition and name it OA.
+    - **Signals**: Set TYPE - DERIVED, FOLDER - PCA, NAME - RKNEE_ANGLE, COMPONENT - X. This is the only signal in the data set.
+    - **Events**: There are no events in this data set, so this tab can be skipped.
+    - **Refinements**: Check the **Refine using tag** checkbox and select the OA tag.
+    - Click **Save**
  
 Next we will define a query for subjects with the NC tag (indicating Normal Control). In this case we can easily modify our previous query rather than starting from scratch.
  
-1. {{:sift_action_add.png?20}} Add a query, name it NC, and click **Save**.
-
-2. {{:sift_action_add.png?20}} Add a condition and name it NC.
-
-3. In the **Refinements** tab, change the selected tag from OA to NC.
-
-4. Click **Save**.
+  - {{:sift_action_add.png?20}} Add a query, name it NC, and click **Save**.
+  - {{:sift_action_add.png?20}} Add a condition and name it NC.
+  - In the **Refinements** tab, change the selected tag from OA to NC.
+  - Click **Save**.
  
 You can verify here that the new NC group has the same signal and event selections as the OA group. Click **Calculate All Queries** and then close the Query Builder dialog.
Line 53: Line 43:
 There will now be two Groups in the [[Sift:Application:Explore_Page|Explore Page]] subwindow's **Groups** list. Selecting either of these will display the associated workspaces in the **Workspaces** list below. We can verify the queries that we produced in the first section by visualizing our traces.
  
-1. Set the plot type to Signal-Time.
-
-2. Select all groups and all workspaces.
-
-3. Check only the **Plot Workspace Mean** checkbox.
-
-4. Click **Refresh Plot**.
+  - Set the plot type to Signal-Time.
+  - Select all groups and all workspaces.
+  - Check only the **Plot Workspace Mean** checkbox.
+  - Click **Refresh Plot**.
  
 The plot that is produced will not be very informative if the traces are not coloured by group, which is the comparison we are interested in. If this is the case, open the {{:sift_data_options.png?20}} **Show Data Options** dialog from the application tool bar and in the top right of the dialog under **Display Styles from...** select "Group".
Line 73: Line 60:
 ==== Running Principal Component Analysis ====
  
-1. Ensure that all groups and workspaces are selected in the **Groups** and **Workspaces** lists.
-
-2. Select the {{:sift_run_pca.png?20}} icon on the [[Sift:Application:Toolbar|toolbar]]. This will bring you to the analysis page and prompt the PCA settings dialog.
-
-3. Set the name for this PCA.
-
-4. Set **Number PCs** to 4.
-
-5. Ensure that **Use Workspace Mean** is checked.
-
-6. Click **Run PCA**.
-
-7. The results of these calculations will automatically populate the PCA graphs on the [[Sift:Application:Analyse_Page|Analyse Page]]
+  - Ensure that all groups and workspaces are selected in the **Groups** and **Workspaces** lists.
+  - Select the {{:sift_run_pca.png?20}} icon on the [[Sift:Application:Toolbar|toolbar]]. This will bring you to the analysis page and prompt the PCA settings dialog.
+  - Set the name for this PCA.
+  - Set **Number PCs** to 4.
+  - Ensure that **Use Workspace Mean** is checked.
+  - Click **Run PCA**.
+  - The results of these calculations will automatically populate the PCA graphs on the [[Sift:Application:Analyse_Page|Analyse Page]]
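
For readers who want to see roughly what a waveform PCA with four components computes, the following is a minimal sketch using NumPy. It is not Sift's implementation, which this page does not describe. The function name `waveform_pca` and the synthetic "OA" and "NC" waveforms are illustrative assumptions, and each row is assumed to be one time-normalized workspace-mean trace.

```python
import numpy as np

def waveform_pca(waveforms, n_components):
    """PCA on a (n_waveforms, n_samples) array of time-normalized traces.

    Hypothetical helper for illustration; not part of Sift.
    """
    X = np.asarray(waveforms, dtype=float)
    centered = X - X.mean(axis=0)            # subtract the mean waveform
    # SVD of the centered data: rows of vt are the principal components
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    components = vt[:n_components]           # (n_components, n_samples)
    scores = centered @ components.T         # (n_waveforms, n_components)
    explained = s**2 / np.sum(s**2)          # fraction of variance per PC
    return components, scores, explained[:n_components]

# Synthetic stand-in data (assumption): 3 "OA" and 3 "NC" mean waveforms,
# 101 samples each, differing mainly in peak amplitude.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 1.0, 101)
oa = [40 * np.sin(np.pi * t) ** 2 + rng.normal(0, 1, t.size) for _ in range(3)]
nc = [60 * np.sin(np.pi * t) ** 2 + rng.normal(0, 1, t.size) for _ in range(3)]

components, scores, explained = waveform_pca(oa + nc, n_components=4)
print(scores[:, 0])   # PC1 scores: the two groups fall on opposite sides of zero
print(explained)      # fraction of variance explained by each of the 4 PCs
```

In this sketch the group difference shows up in the PC1 scores because the synthetic groups differ mostly in amplitude; real data may separate on a different component or not at all.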
  
  
Line 94: Line 75:
 === Variance Explained ===
  
-The Variance Explained window, which displays the variance explained by each principal component as well as the cumulative variance for each principal components. It is important to verify that the calculated principal components do explain a significant amount of the data set's variability. A good heuristic to use is that you want enough principal components to explain 95% of the data set's variety, otherwise there will be at least a moderate amount of variation that your analysis has not captured. In this example, our 4 principal components explain 96% of the data set's variability, which is sufficient and we can continue the exploration. If there less than 95% of the data set's variance was explained then we should re-run the analysis with more principal components.
+The Variance Explained window, which displays the variance explained by each principal component as well as the cumulative variance for each principal components. It is important to verify that the calculated principal components do explain a significant amount of the data set's variability. A good heuristic to use is that you want enough principal components to explain 95% of the data set's variety, otherwise there will be at least a moderate amount of variation that your analysis has not captured. In this example, our 4 principal components explain 96% of the data set's variability, which is sufficient and we can continue the exploration. If less than 95% of the data set's variance was explained then we should re-run the analysis with more principal components.
  
 {{:Sift_pca_tut_Results1.png?800}}
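
The 95% heuristic described above can be checked with a few lines of Python. This is only a sketch: the helper name `components_needed` is hypothetical, and the per-component fractions are made-up values chosen so that four components total 96%, matching the figure quoted in the tutorial. The actual per-component values are shown in the Variance Explained window.

```python
def components_needed(explained, threshold=0.95):
    """Return the smallest number of leading PCs whose cumulative
    variance explained reaches `threshold`, or None if it never does."""
    cumulative = 0.0
    for count, fraction in enumerate(explained, start=1):
        cumulative += fraction
        # small tolerance guards against floating-point rounding
        if cumulative >= threshold - 1e-12:
            return count
    return None

# Illustrative fractions (assumption): only the 96% total comes from the tutorial
explained = [0.60, 0.20, 0.10, 0.06]
needed = components_needed(explained)
print(needed)

# If the requested PCs fall short of 95%, the result is None,
# which corresponds to re-running the analysis with more PCs.
short = components_needed([0.50, 0.20, 0.10, 0.05])
print(short)
```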
sift/tutorials/perform_principal_component_analysis.1722007620.txt.gz · Last modified: 2024/07/26 15:27 by sgranger