Man1130/slides.md

louiscklaw e44aead3d5 update,

2025-02-01 01:58:19 +08:00

---
marp: true
title: Marp CLI example
description: Hosting Marp slide deck on the web
theme: uncover
paginate: true
_paginate: false
backgroundImage: url('https://www.google.com/url?sa=i&url=https%3A%2F%2Fcommons.wikimedia.org%2Fwiki%2FFile%3AHelloWorld.svg&psig=AOvVaw0d3lmyaMphPi0ANeGIEJOw&ust=1670049479380000&source=images&cd=vfe&ved=0CBAQjRxqFwoTCJjxx6Gp2vsCFQAAAAAdAAAAABAE')
footer: 2022 project presentation
style: |
  section { background-color: #ccc; padding: 0 10vw; }
  footer { text-align: center; }
---

slide topic?

overview of the dataset

Male and Female proportion
- male by age
- female by age
The types of chest pain experienced among the patients
- pie chart
Correlation between chest pain type and target (confusion matrix?)
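
A feature-versus-target relationship such as chest pain type against target is usually summarised with a contingency table (cross-tab); a confusion matrix compares predictions with actual labels instead. The sketch below is only an illustration: the records and the chest pain category names are made up, and `crosstab` and `positive_rate` are hypothetical helper names, not part of the project.

```python
from collections import Counter

# Hypothetical records: (chest_pain_type, target). Values are invented
# for illustration and are NOT taken from the project dataset.
records = [
    ("typical", 0), ("typical", 1),
    ("atypical", 1), ("atypical", 1),
    ("non-anginal", 1), ("non-anginal", 0),
    ("asymptomatic", 0), ("asymptomatic", 0),
]

def crosstab(rows):
    """Count how often each (chest pain type, target) pair occurs."""
    return Counter(rows)

def positive_rate(rows):
    """Share of target == 1 within each chest pain type."""
    totals = Counter(cp for cp, _ in rows)
    positives = Counter(cp for cp, target in rows if target == 1)
    return {cp: positives[cp] / totals[cp] for cp in totals}

counts = crosstab(records)
rates = positive_rate(records)
for cp, rate in rates.items():
    print(f"{cp:13s} target=1 rate: {rate:.2f}")
```

The per-category rates are the numbers a pie chart or bar chart on this slide would be built from.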

Decision Tree

overview of the dataset

Decision Tree figure
- Performance Analysis
  - confusion matrix
  - ROC
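
As a reminder of what the confusion matrix on this slide contains, here is a minimal sketch for binary labels. The label lists are invented for illustration, and `confusion_matrix` is a hand-written helper here, not a library call.

```python
def confusion_matrix(y_true, y_pred):
    """Return (tn, fp, fn, tp) for binary labels encoded as 0/1."""
    tn = fp = fn = tp = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 0 and predicted == 0:
            tn += 1
        elif actual == 0 and predicted == 1:
            fp += 1
        else:
            fn += 1
    return tn, fp, fn, tp

# Made-up labels and predictions, for illustration only.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tn, fp, fn, tp = confusion_matrix(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
sensitivity = tp / (tp + fn)  # true positive rate
specificity = tn / (tn + fp)  # true negative rate
print(tn, fp, fn, tp, accuracy, sensitivity, specificity)
```

Accuracy alone hides the split between false positives and false negatives, which is why the matrix is shown alongside it.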

Naive Bayes

overview of the dataset

Naive Bayes figure
- Performance Analysis
  - confusion matrix
  - ROC
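
The ROC curve is built by sweeping a threshold over the model's predicted scores and recording the false positive rate and true positive rate at each step. The sketch below assumes both classes are present; the labels and scores are invented, and `roc_points` and `auc` are hypothetical helper names.

```python
def roc_points(y_true, scores):
    """Return ROC points (fpr, tpr), one per distinct score threshold.

    Assumes y_true contains at least one 0 and at least one 1.
    """
    pos = sum(y_true)
    neg = len(y_true) - pos
    ranked = sorted(zip(scores, y_true), reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    for i, (score, label) in enumerate(ranked):
        if label == 1:
            tp += 1
        else:
            fp += 1
        # Emit a point only after the last item of a group of tied scores.
        if i + 1 == len(ranked) or ranked[i + 1][0] != score:
            points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Made-up labels and predicted probabilities, for illustration only.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(y_true, scores)
area = auc(points)
print(points, round(area, 3))
```

An area of 0.5 corresponds to random ranking and 1.0 to a perfect ranking of positives above negatives.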

Logistic Regression

overview of the dataset

Logistic Regression figure

Performance Analysis

Performance Analysis
- confusion matrix
- ROC

Performance Analysis (cont'd)

pick a sample for Naive Bayes modeling:
- naive_bayes_scramble.ipynb
- column selection
  - compare column subsets vs accuracy
the accuracy with fewer columns MAY BE better than with more columns
- possible causes
  - column noise / input data accuracy?
  - model overfitting?
  - extreme cases?
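
The column-versus-accuracy comparison can be sketched as follows. This is not the code from `naive_bayes_scramble.ipynb`, which is not shown here; it is a minimal illustration that assumes categorical columns, a hand-written naive Bayes with add-one smoothing, and leave-one-out evaluation. The column names, the rows, and the helpers `train`, `predict` and `loo_accuracy` are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations
from math import log

# Hypothetical toy data: invented categorical columns, NOT the project dataset.
COLUMNS = ["cp", "sex", "noise"]
ROWS = [
    ({"cp": "a", "sex": "m", "noise": "x"}, 1),
    ({"cp": "a", "sex": "f", "noise": "y"}, 1),
    ({"cp": "a", "sex": "m", "noise": "y"}, 1),
    ({"cp": "a", "sex": "f", "noise": "x"}, 1),
    ({"cp": "b", "sex": "m", "noise": "x"}, 0),
    ({"cp": "b", "sex": "f", "noise": "y"}, 0),
    ({"cp": "b", "sex": "m", "noise": "y"}, 0),
    ({"cp": "b", "sex": "f", "noise": "x"}, 0),
]

def train(rows, cols):
    """Fit categorical naive Bayes: class counts and per-column value counts."""
    class_counts = Counter(label for _, label in rows)
    value_counts = defaultdict(Counter)  # (column, label) -> value counts
    for features, label in rows:
        for col in cols:
            value_counts[(col, label)][features[col]] += 1
    return class_counts, value_counts

def predict(model, features, cols, n_values):
    """Return the label with the highest log-posterior score."""
    class_counts, value_counts = model
    total = sum(class_counts.values())
    best_label, best_score = None, float("-inf")
    for label, count in sorted(class_counts.items()):
        score = log(count / total)
        for col in cols:
            seen = value_counts[(col, label)][features[col]]
            # Add-one (Laplace) smoothing avoids log(0) for unseen values.
            score += log((seen + 1) / (count + n_values[col]))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

def loo_accuracy(rows, cols):
    """Leave-one-out accuracy using only the given columns."""
    n_values = {col: len({f[col] for f, _ in rows}) for col in cols}
    hits = 0
    for i, (features, label) in enumerate(rows):
        model = train(rows[:i] + rows[i + 1:], cols)
        hits += predict(model, features, cols, n_values) == label
    return hits / len(rows)

# Accuracy for every non-empty subset of columns.
results = {
    subset: loo_accuracy(ROWS, subset)
    for k in range(1, len(COLUMNS) + 1)
    for subset in combinations(COLUMNS, k)
}
for subset, acc in sorted(results.items(), key=lambda kv: -kv[1]):
    print(subset, round(acc, 2))
```

Exhaustive subsets are only practical for a handful of columns, since the number of subsets doubles with each added column.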

Performance Analysis (cont'd)

how to improve?
- the choice of columns may be better if other faculties are involved
- more labelled data generally improves accuracy

Performance Analysis (cont'd)

disclaimer?
- no model can achieve 100% accuracy
- why?
  - extreme cases
  - chaos theory?
    - a model can never take every factor into account
however, a model can be considered if its accuracy is above nn% in general

why these methods?

unsupervised modeling
- reasons for ruling it out
  - data is already labelled
  - data is small in amount and discrete
- KNN (strictly a supervised method)
- K-means

why these methods?

supervised modeling / supervised learning
- data is already labelled
  - [ ] Decision Tree
  - [ ] Naive Bayes
    - performance? accuracy? ROI?
- multi-dimensional (many-column) data is difficult to understand/maintain
  - Logistic Regression
  - SVM

notes only