
Commit

vault backup: 2023-10-04 17:40:33
Affected files:
content/notes/university/year3/cs3002/cs3002-lab1.md
pietraferreira committed Oct 4, 2023
1 parent 32444ce commit 74e2e05
Showing 1 changed file with 2 additions and 2 deletions.
@@ -45,7 +45,7 @@ $|32 - 18| + |110 - 85| + |23 - 27| =$
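For reference, the context line above is an L1 (Manhattan) distance sum, which evaluates to $|32 - 18| + |110 - 85| + |23 - 27| = 14 + 25 + 4 = 43$.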

**Q3)** Describe two clustering methods and their advantages/disadvantages.

- - Hierarchical: groups items by how similar or different they are to one another. For example, given a set of animals, start by putting each animal in its own group, then repeatedly merge the two most similar groups. Keep doing this until a few big groups remain, each representing a category of animals. Merging the closest clusters step by step produces a dendrogram (a minimal sketch follows the pros and cons below).
+ - **Hierarchical**: groups items by how similar or different they are to one another. For example, given a set of animals, start by putting each animal in its own group, then repeatedly merge the two most similar groups. Keep doing this until a few big groups remain, each representing a category of animals. Merging the closest clusters step by step produces a dendrogram (a minimal sketch follows the pros and cons below).

Pros:
- can produce an ordering of the objects, which can be informative for data display.
@@ -55,7 +55,7 @@ Pros:
- can't reallocate objects that have been 'incorrectly' grouped at an early stage.
- different distance metrics may generate different results.
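
A minimal sketch of agglomerative (hierarchical) clustering with SciPy; the toy feature matrix and the choice of average linkage are illustrative assumptions, not part of the lab:

```python
import numpy as np
from scipy.cluster.hierarchy import dendrogram, fcluster, linkage

# Toy data: each row is one item (e.g. an animal), each column a feature.
X = np.array([
    [1.0, 2.0],
    [1.2, 1.9],
    [8.0, 8.5],
    [8.2, 8.1],
    [4.0, 5.0],
])

# Start with every item in its own cluster, then repeatedly merge the two
# closest clusters; "average" linkage merges by mean inter-cluster distance.
Z = linkage(X, method="average", metric="euclidean")

# Z encodes the merge history (the dendrogram); cut it into 2 flat clusters.
labels = fcluster(Z, t=2, criterion="maxclust")
print(labels)  # e.g. [1 1 2 2 1]: one cluster label per row of X

# dendrogram(Z) draws the tree if matplotlib is installed.
```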

- - K-Means: first decide how many groups you want, for example 3, and randomly pick 3 items to be the centre of each group. Then look at each item, find the centre it is closest to, and put it in that centre's group. Once every item is assigned, compute a new centre for each group by averaging all the items in it. Repeat until the centres stop moving much (a from-scratch sketch follows the pros below).
+ - **K-Means**: first decide how many groups you want, for example 3, and randomly pick 3 items to be the centre of each group. Then look at each item, find the centre it is closest to, and put it in that centre's group. Once every item is assigned, compute a new centre for each group by averaging all the items in it. Repeat until the centres stop moving much (a from-scratch sketch follows the pros below).

Pros:
- can be computationally faster than hierarchical clustering if K is small.
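
A from-scratch sketch of the loop described above; the toy data, k = 3, and the convergence tolerance are illustrative assumptions:

```python
import numpy as np

def kmeans(X, k=3, max_iters=100, tol=1e-4, seed=0):
    rng = np.random.default_rng(seed)
    # 1) Randomly pick k items as the initial centres.
    centres = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(max_iters):
        # 2) Assign each item to its closest centre (Euclidean distance).
        dists = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # 3) Move each centre to the average of the items assigned to it;
        #    keep a centre in place if no items were assigned to it.
        new_centres = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centres[j]
            for j in range(k)
        ])
        # 4) Stop once the centres barely change.
        shift = np.linalg.norm(new_centres - centres)
        centres = new_centres
        if shift < tol:
            break
    return labels, centres

# Three blobs of toy 2-D points around different centres.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(c, 0.5, size=(20, 2)) for c in ([0, 0], [5, 5], [0, 5])])
labels, centres = kmeans(X, k=3)
```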
