Project author: achuthasubhash

Project description:
Complete-Life-Cycle-of-a-Data-Science-Project
Language:
Project URL: git://github.com/achuthasubhash/Complete-Life-Cycle-of-a-Data-Science-Project.git


Complete-Life-Cycle-of-a-Data-Science-Project

CREDITS: All corresponding resources

MOTIVATION: I created this repository to help upcoming aspirants and others in the data science field.

https://www.theinsaneapp.com/2021/03/how-to-build-machine-learning-project.html

If you like my work, please buy me a coffee; it motivates me -> https://www.buymeacoffee.com/achuthasubhash?new=1

Business understanding

1.Data collection

Data comes in three kinds:

  1. a.Structured data (tabular data, etc.)
  2. b.Unstructured data (images, text, audio, etc.)
  3. c.Semi-structured data (XML, JSON, etc.)
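To make the difference between semi-structured and structured data concrete, here is a minimal sketch that flattens nested JSON records into tabular rows using only the standard library. The sample records, the `flatten` helper, and the dotted column naming are assumptions made for illustration; they do not come from any of the linked resources.

```python
import json


def flatten(record, parent_key="", sep="."):
    """Flatten nested dicts into one level, joining keys with `sep`."""
    flat = {}
    for key, value in record.items():
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            flat.update(flatten(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Hypothetical semi-structured input (illustrative only).
raw = (
    '[{"id": 1, "user": {"name": "Ada", "city": "London"}},'
    ' {"id": 2, "user": {"name": "Alan"}}]'
)
records = [flatten(r) for r in json.loads(raw)]

# Build one shared column set so every row has the same fields;
# values missing from a record become None.
columns = sorted({key for rec in records for key in rec})
rows = [[rec.get(col) for col in columns] for rec in records]

print(columns)
for row in rows:
    print(row)
```

Missing fields are kept as `None` rather than dropped, so the result stays rectangular and can be loaded into a table or data frame afterwards.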

Variable types

  1. a.Qualitative (nominal, ordinal, binary)
  2. b.Quantitative (discrete, continuous)
  3. https://www.chi2innovations.com/blog/discover-data-blog-series/data-types-101/
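The variable types above matter in practice because they decide how a column is encoded. The following is a small sketch, assuming made-up `sizes` and `colors` columns: ordinal values are mapped to integer ranks, while nominal values are one-hot encoded so that no artificial order is introduced. The helper names are hypothetical and written here in plain Python rather than taken from a specific library.

```python
def ordinal_encode(values, order):
    """Map ordered categories to integer ranks (0 = lowest)."""
    rank = {category: i for i, category in enumerate(order)}
    return [rank[v] for v in values]


def one_hot_encode(values):
    """Map unordered categories to indicator vectors, one column per category."""
    categories = sorted(set(values))
    vectors = [[int(v == c) for c in categories] for v in values]
    return categories, vectors


# Hypothetical example columns (illustrative only).
sizes = ["small", "large", "medium"]   # ordinal: the order carries meaning
colors = ["red", "blue", "red"]        # nominal: no inherent order

size_codes = ordinal_encode(sizes, order=["small", "medium", "large"])
color_columns, color_vectors = one_hot_encode(colors)

print(size_codes)
print(color_columns, color_vectors)
```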

Data sources: databases, scraping data from websites, purchasing data, surveys, sensors, cameras, APIs, etc.

cleanlab https://l7.curtisnorthcutt.com/cleanlab-python-package https://github.com/cgnorthcutt/cleanlab https://github.com/cgnorthcutt/label-errors https://github.com/cgnorthcutt/rankpruning https://github.com/subeeshvasu/Awesome-Learning-with-Label-Noise

Measure Data Quality ydata-quality https://github.com/ydataai/ydata-synthetic https://towardsdatascience.com/how-can-i-measure-data-quality-9d31acfeb969

a.Web scraping best article to refer-https://towardsdatascience.com/choose-the-best-python-web-scraping-library-for-your-application-91a68bc81c4f

https://www.analyticsvidhya.com/blog/2019/10/web-scraping-hands-on-introduction-python/?utm_source=linkedin&utm_medium=KJ|link|weekend-blogs|blogs|44087|0.875

https://www.kdnuggets.com/2021/02/6-web-scraping-tools.html

https://www.bigdatanews.datasciencecentral.com/profiles/blogs/top-30-free-web-scraping-software

https://towardsdatascience.com/6-web-scraping-tools-that-make-collecting-data-a-breeze-457c44e4411d

https://medium.com/analytics-vidhya/master-web-scraping-completly-from-zero-to-hero-38051423256b

  1. 1.Beautifulsoup https://www.freecodecamp.org/news/how-to-scrape-websites-with-python-and-beautifulsoup-5946935d93fe/
  2. mechanicalsoup https://analyticsindiamag.com/mechanicalsoup-web-scraping-custom-dataset-tutorial/
  3. 2.Scrapy,PyScrappy,Pandas Datareader,Instaloader,lxml
  4. 3.Selenium https://www.freecodecamp.org/news/better-web-scraping-in-python-with-selenium-beautiful-soup-and-pandas-d6390592e251/
  5. 4.Requests (HTTP library) to access data
  6. 5.AUTOSCRAPER - https://github.com/alirezamika/autoscraper https://www.youtube.com/watch?v=9BQ353Yu1D0 https://www.analyticsvidhya.com/blog/2021/04/automate-web-scraping-using-python-autoscraper-library/
  7. scrapeasy Scrape Any Website in Seconds with One Line of Code https://github.com/joelbarmettlerUZH/Scrapeasy
  8. Scrap Images From E-Commerce Website Using AutoScraper https://www.analyticsvidhya.com/blog/2021/05/scrap-images-from-e-commerce-website-using-autoscraper-library/
  9. amazon auto scraper library https://webautomation.io/
  10. Listly https://www.listly.io/r/stdfr
  11. FiftyOne Now easier to download and evaluate https://towardsdatascience.com/googles-open-images-now-easier-to-download-and-evaluate-with-fiftyone-615ce0482c02
  12. webbot https://pypi.org/project/webbot/
  13. gazpacho https://github.com/maxhumber/gazpacho
  14. html_scraper_streamlit_app https://www.youtube.com/watch?v=6U5xJ3mXRKA&feature=youtu.be
  15. 6.Twitter scraping tool (twint or tweepy or tweetlib)-https://github.com/twintproject/twint
  16. twitterscraper https://www.youtube.com/watch?v=MpIi4HtCiVk
  17. twython https://github.com/ryanmcgrath/twython
  18. twarc https://github.com/DocNow/twarc https://scholarslab.github.io/learn-twarc/01-quick-start.html
  19. snscrape extract Twitter data https://github.com/JustAnotherArchivist/snscrape
  20. Scweet A simple and unlimited twitter scraper https://github.com/Altimis/Scweet
  21. GetOldTweets3,GoogleNews,snscrape
  22. Scrape Twitter for Tweets https://github.com/taspinar/twitterscraper
  23. HAR File Web Scraper https://stevesie.com/har-file-web-scraper https://www.youtube.com/watch?v=LcqVDfueb8g
  24. https://analyticsindiamag.com/complete-tutorial-on-twint-twitter-scraping-without-twitters-api/
  25. https://developer.twitter.com/en/docs
  26. pytrends https://medium.com/nerd-for-tech/scraping-data-from-online-platforms-to-enhance-time-series-forecasts-6eec3c68636d
  27. Scraping Instagram -instaloader https://thecleverprogrammer.com/2020/07/30/scraping-instagram-with-python/
  28. Instascrape
  29. Scrape LinkedIn Profiles with ProxyCurl API
  30. Reddit Dataset Using PSAW and PRAW in Python
  31. Scraping Reddit using Python Reddit API Wrapper (PRAW)
  32. Scrape Wikipedia wikipedia https://www.thepythoncode.com/article/access-wikipedia-python
  33. patang - Scrape Product details from eCommerce Sites with Puppeteer and DOM String https://www.youtube.com/watch?v=3sgxRmyOuXs
  34. Download Wikipedia https://www.wikidata.org/wiki/Wikidata:Main_Page https://www.youtube.com/watch?v=hC1rY4lRY0s https://towardsdatascience.com/an-efficient-way-to-read-data-from-the-web-directly-into-python-a526a0b4f4cb
  35. Web Scraping to Create a CSV File https://thecleverprogrammer.com/2020/08/08/web-scraping-to-create-csv/
  36. Amazon Web Scraper, Amazon Auto Scraper
  37. 7.urllib
  38. 8.pattern
  39. 9.Octoparse Easy Web Scraping https://www.octoparse.com/
  40. prowebscraper https://prowebscraper.com/features
  41. Web scraper https://chrome.google.com/webstore/detail/web-scraper-free-web-scra/jnhgnonknehpejjnehehllkliplmbmhn?hl=en
  42. ParseHub https://www.parsehub.com/ https://analyticsindiamag.com/parsehub-no-code-gui-based-web-scraping-tool/
  43. PyScrappy https://github.com/mldsveda/PyScrappy https://www.analyticsvidhya.com/blog/2022/02/web-scraping-with-pyscrappy/
  44. Gazpacho https://github.com/maxhumber/gazpacho
  45. ScrapeSimple Website: https://www.scrapesimple.com
  46. Content Grabber https://contentgrabber.com/Manual/understanding_the_concept.htm
  47. Crawly https://crawly.diffbot.com/
  48. Apify https://apify.com/
  49. Mozenda Website: https://www.mozenda.com/
  50. obsei https://github.com/lalitpagaria/obsei
  51. Diffbot https://analyticsindiamag.com/diffbot/
  52. Trustpilot,webhose,scrapingbot
  53. lxml https://lxml.de/index.html#introduction
  54. ScrapingBee https://analyticsindiamag.com/scrapingbee-api/
  55. Scrape HTML tables https://www.youtube.com/watch?v=6U5xJ3mXRKA&feature=youtu.be or pd.read_html
  56. requests-html https://github.com/kennethreitz/requests-html
  57. newspaper https://github.com/codelucas/newspaper https://www.youtube.com/watch?v=Hfry5XnISyc
  58. newspaper3k: https://newspaper.readthedocs.io # easily extract text from articles
  59. newscatcher https://github.com/kotartemiy/newscatcher https://www.youtube.com/watch?v=pHzOuizZq4I
  60. patang (extract product details) https://github.com/tejazz/patang
  61. lisc https://github.com/lisc-tools/lisc
  62. Helena WEB AUTOMATION FOR END USERS https://helena-lang.org/
  63. pandas(read_html)
  64. wget, curl, ParseHub, webhouse, Octoparse, ScrapingBot, ScrapingBee, Common, Content Grabber, Docparser, Scraper API, Import.io, Altair Monarch, WebAutomation.io, WebScraper.io, Scrape.do, AvesAPI, Scrapingdog, Diffbot, Grepsr, Scrapy
  65. Crawl Crawly https://crawly.diffbot.com/
  66. HTML basics for web scraping,Web Scraping with Octoparse,Web Scraping with Selenium
  67. 10-best-web-scraping-tools https://www.scraperapi.com/blog/the-10-best-web-scraping-tools/
  68. https://www.kdnuggets.com/2021/02/6-web-scraping-tools.html
  69. https://analyticsindiamag.com/complete-learning-path-to-web-scraping-with-all-major-tools/ https://towardsdatascience.com/6-web-scraping-tools-that-make-collecting-data-a-breeze-457c44e4411d
  70. https://towardsdatascience.com/6-web-scraping-tools-that-make-collecting-data-a-breeze-457c44e4411d https://www.kdnuggets.com/2018/02/web-scraping-tutorial-python.html
  71. https://www.octoparse.com/ https://github.com/tirthajyoti/pydbgen https://www.mozenda.com/ https://www.mockaroo.com/ https://lionbridge.ai/ https://www.mturk.com/ https://appen.com/
  72. 11.GoogleImageCrawler,google_images_download,bing_image
  73. https://www.freepik.com/popular-photos , https://stocksnap.io/ , https://www.pexels.com/ ,https://unsplash.com/ , https://pixabay.com/
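Several entries in the list above deal with pulling HTML tables into rows (for example `pd.read_html`). As a rough illustration of what such tools do under the hood, here is a sketch that uses only the standard-library `html.parser` module, not BeautifulSoup, Scrapy, or pandas. The `TableParser` class and the inline HTML snippet are assumptions made for this example; a real scraper would first fetch the page and should respect the site's terms and robots.txt.

```python
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Text outside a cell (e.g. whitespace between tags) is ignored.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


# Hypothetical page fragment (illustrative only).
html = """
<table>
  <tr><th>Library</th><th>Type</th></tr>
  <tr><td>Scrapy</td><td>framework</td></tr>
  <tr><td>lxml</td><td>parser</td></tr>
</table>
"""

parser = TableParser()
parser.feed(html)
header, *body = parser.rows
print(header)
print(body)
```

For real pages, the dedicated libraries listed above handle malformed markup, nested tables, and encodings far more robustly than this sketch.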

b.Web Crawling

https://python.libhunt.com/scrapy-alternatives

Flat Data https://octo.github.com/projects/flat-data

b.Third-party APIs

22 APIs every data scientist should learn https://www.springboard.com/library/data-science/top-apis-for-data-scientists/

c.Creating your own data (manual collection, e.g. Google Docs, surveys, etc.): primary data

d.ETL: awesome ETL https://github.com/pawl/awesome-etl#python https://github.com/achuthasubhash/awesome-etl

38x faster data pipelines with tf.data

d.Databases

Databases are of two kinds: SQL and NoSQL.

SQL, SQLite, MySQL, MongoDB, MontyDB, Hadoop, Elasticsearch, Cassandra, Amazon S3, Hive, Google Bigtable, AWS DynamoDB, HBase, Oracle DB

sql https://mode.com/sql-tutorial/ https://www.w3schools.com/sql/

sql in python https://medium.com/jbennetcodes/how-to-rewrite-your-sql-queries-in-pandas-and-more-149d341fc53e
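As a small illustration of using SQL from Python, the sketch below runs an aggregate query against an in-memory SQLite database through the standard-library `sqlite3` module. The `sales` table and its rows are made up for the example; the same pattern of connecting, executing parameterized statements, and fetching rows carries over to other SQL databases through their own drivers.

```python
import sqlite3

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")

# Parameterized inserts avoid building SQL strings by hand.
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("north", 120.0), ("south", 80.0), ("north", 30.0)],
)

# Aggregate per region, sorted so the output order is deterministic.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()
conn.close()

print(totals)
```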

PyMongo https://analyticsindiamag.com/guide-to-pymongo-a-python-wrapper-for-mongodb/

Cloud AI Data labeling service https://cloud.google.com/ai-platform/data-labeling/docs?utm_source=youtube&utm_medium=Unpaidsocial&utm_campaign=guo-20200503-Data-Labeling

e.Online resources - ultimate resource https://datasetsearch.research.google.com/ https://medium.com/swlh/where-to-find-awesome-machine-learning-datasets-6bb909a3f350

10 BEST DATA COLLECTION TOOLS FOR EFFECTIVE RESULTS https://www.analyticsinsight.net/10-best-data-collection-tools-for-effective-results/

https://www.freecodecamp.org/news/https-medium-freecodecamp-org-best-free-open-data-sources-anyone-can-use-a65b514b0f2d/ https://research.google/tools/datasets/

Machine learning datasets https://www.datasetlist.com/ https://wiki.pathmind.com/open-datasets

https://guides.library.cmu.edu/az.php https://docs.microsoft.com/en-us/azure/azure-sql/public-data-sets https://registry.opendata.aws/ https://paperswithcode.com/datasets https://datasets.quantumstat.com/ https://www.quandl.com/ http://dataportals.org/ https://opendatamonitor.eu/frontend/web/index.php?r=dashboard%2Findex https://en.wikipedia.org/wiki/List_of_datasets_for_machine-learning_research https://www.quora.com/Where-can-I-find-large-datasets-open-to-the-public https://www.reddit.com/r/datasets/ https://ourworldindata.org/ https://data.worldbank.org/ https://data.world/ https://data.census.gov/cedsci/ https://data.seattle.gov/ https://www.openml.org/ https://visualdata.io/discovery

World’s Largest Data Platform https://worlddata.ai/

  1. Awesome list of datasets in 100+ categories https://www.kdnuggets.com/2021/05/awesome-list-datasets.html
  2. https://sebastianraschka.com/blog/2021/ml-dl-datasets.html https://enoumen.com/2021/04/23/data-sciences-datasets-data-visualization-data-analytics-big-data-data-lakes/
  3. https://serokell.io/blog/best-machine-learning-datasets https://medium.com/@ODSC/25-excellent-machine-learning-open-datasets-940ca2124dfc
  4. 1)kaggle-https://www.kaggle.com/datasets , pip install kaggledatasets
  5. Downloading Kaggle datasets directly into Google Colab -https://towardsdatascience.com/downloading-kaggle-datasets-directly-into-google-colab-c8f0f407d73a
  6. How to Download Kaggle Datasets using Jupyter Notebook https://www.analyticsvidhya.com/blog/2021/04/how-to-download-kaggle-datasets-using-jupyter-notebook/
  7. 2)https://sebastianraschka.com/blog/2021/ml-dl-datasets.html
  8. movielens-https://grouplens.org/datasets/movielens/latest/
  9. DagsHub datasets https://dagshub.com/explore/datasets
  10. 100+ of the Best Free Data Sources For Your Next Project https://www.columnfivemedia.com/100-best-free-data-sources-infographic/
  11. World and national data, maps & rankings https://knoema.com/atlas/sources
  12. 3)data.gov-https://data.gov.in/
  13. 4)uci-https://archive.ics.uci.edu/ml/datasets.php https://github.com/tirthajyoti/UCI-ML-API
  14. 5)Group Lens dataset https://grouplens.org/
  15. Wikipedia ML Datasets https://en.wikipedia.org/wiki/List_of_datasets_for_machine-learning_research
  16. AWS Open Data Registry,data.gov (portals),YELP Open dataset,UNICEF Dataset,Big Bad NLP Database,Microsoft Dataset
  17. 6)world3bank https://data.world/ , worldbank
  18. 7)Google Cloud BigQuery public datasets
  19. Google Public Datasets-cloud.google.com/bigquery/public-data/
  20. Google Cloud Data Catalog https://cloud.google.com/data-catalog
  21. Academic Torrents-https://academictorrents.com/check.htm?returnto=%2Fbrowse.php
  22. 8)online hackathons
  23. Datasets https://www.paperswithcode.com/datasets
  24. 9)image data from google_images_download
  25. https://www.visualdata.io/discovery
  26. http://xviewdataset.org/#dataset
  27. https://ai.googleblog.com/2016/09/introducing-open-images-dataset.html
  28. 10)image data from Bing_Search
  29. image data from simple_image_download https://github.com/RiddlerQ/simple_image_download
  30. 11)https://www.columnfivemedia.com/100-best-free-data-sources-infographic
  31. graviti Unleash the Power of Unstructured Data https://www.graviti.com/?utm_medium=0730Ismael
  32. 12)Reddit:https://lnkd.in/dv5UCD4 https://www.reddit.com/r/datasets/
  33. praw.Reddit https://github.com/praw-dev/praw
  34. 13)https://datasets.bifrost.ai/?ref=producthunt
  35. 14)data.world:https://lnkd.in/gEK897K
  36. 15)https://data.world/datasets/open-data
  37. https://tinyletter.com/data-is-plural
  38. 16)FiveThirtyEight :- https://lnkd.in/gyh-HDj , https://data.fivethirtyeight.com/
  39. 17)BuzzFeed :- https://lnkd.in/gzPWyHj
  40. Buzzfeed News -github.com/BuzzFeedNews
  41. Socrata - https://opendata.socrata.com/
  42. 18)Google public datasets :- https://lnkd.in/g5dH8qE
  43. Statistics Canada https://www.statcan.gc.ca/eng/start https://towardsdatascience.com/how-to-collect-data-from-statistics-canada-using-python-db8a81ce6475
  44. Deep Image Search AI-based image search engine https://github.com/TechyNilesh/DeepImageSearch
  45. https://www.datasciencecentral.com/profiles/blogs/big-data-sets-available-for-free
  46. 19)Quandl :- https://www.quandl.com stock data
  47. statista : https://www.statista.com/ stock data
  48. 20)Socrata Open Data :- https://lnkd.in/gea7JMz
  49. 21)Academic Torrents :- https://lnkd.in/g-Ur9Xy
  50. 22) Automates Image Annotation for Deep Learning Models https://medium.com/towards-artificial-intelligence/improving-data-labeling-efficiency-with-auto-labeling-uncertainty-estimates-and-active-learning-5848272365be
  51. Label Studio, Sloth, LabelBox, TagTog, Amazon SageMaker GroundTruth, Playment, Superannotate, Dataturk, LightTag, CVAT, LabelImg
  52. Automate data preparation https://www.superb-ai.com/
  53. https://neptune.ai/blog/annotation-tool-comparison-deep-learning-data-annotation?utm_source=linkedin&utm_medium=post&utm_campaign=blog-annotation-tool-comparison-deep-learning-data-annotation
  54. Diffgram,Label Studio ,CVAT,SuperAnnotate,Datasaur https://anthony-sarkis.medium.com/the-5-best-ai-data-annotation-platforms-for-machine-learning-2021-ec17c15142f3
  55. https://foobar167.medium.com/open-source-free-software-for-image-segmentation-and-labeling-4b0332049878
  56. ***Label Assist: Model Assisted Pre-Annotation for Computer Vision https://blog.roboflow.com/announcing-label-assist/ https://www.youtube.com/watch?v=919CihTlkZw&feature=youtu.be***
  57. https://github.com/jsbroks/awesome-dataset-tools
  58. makeml https://makeml.app/
  59. superannotate https://www.superannotate.com/
  60. jupyter-innotater data annotator for Jupyter notebooks https://github.com/ideonate/jupyter-innotater
  61. JupyterLab extension for annotating data https://github.com/explosion/jupyterlab-prodigy
  62. semi-auto-image-annotation-tool https://github.com/virajmavani/semi-auto-image-annotation-tool
  63. labelme, labelImg :- https://github.com/wkentaro/labelme , https://github.com/tzutalin/labelImg
  64. labelCloud lightweight tool for labeling 3D bounding boxes in point clouds https://github.com/ch-sa/labelCloud
  65. Labellerr https://www.labellerr.com/
  66. prodigy Radically efficient machine teaching An annotation tool powered by active learning https://prodi.gy/
  67. Labelbox-https://labelbox.com/
  68. Playment-https://playment.io/
  69. SuperAnnotate -https://www.superannotate.com/
  70. CVAT-https://github.com/openvinotoolkit/cvat
  71. Lionbridge- https://lionbridge.ai/
  72. LinkedAI: A No-code Data Annotations- https://analyticsindiamag.com/linkedai/
  73. Dataturks
  74. V7 Darwin The Rapid Image Annotator https://docs.v7labs.com/docs/loading-a-dataset-in-python https://github.com/v7labs/darwin-py#usage-as-a-python-library
  75. https://waliamrinal.medium.com/top-and-easy-to-use-open-source-image-labelling-tools-for-machine-learning-projects-ffd9d5af4a20
  76. https://github.com/heartexlabs/awesome-data-labeling
  77. Label a Dataset with a Few Lines of Code https://eric-landau.medium.com/label-a-dataset-with-a-few-lines-of-code-45c140ff119d
  78. https://analyticsindiamag.com/complete-guide-to-data-labelling-tools/ https://neptune.ai/blog/data-labeling-software
  79. Extraction of Objects In Images and Videos Using 5 Lines of Code https://towardsdatascience.com/extraction-of-objects-in-images-and-videos-using-5-lines-of-code-6a9e35677a31
  80. https://neptune.ai/blog/data-labeling-software?utm_source=twitter&utm_medium=tweet&utm_campaign=blog-data-labeling-software
  81. 23)tensorflow_datasets as tfds https://www.tensorflow.org/datasets (import tensorflow_datasets as tfds)
  82. https://lionbridge.ai/datasets/tensorflow-datasets-machine-learning/
  83. 24)https://datasets.bifrost.ai/?ref=producthunt
  84. 25)https://ourworldindata.org/
  85. 26)https://data.worldbank.org/
  86. 27)google open images:https://storage.googleapis.com/openimages/web/download.html
  87. 30 Largest TensorFlow Datasets for Machine Learning https://lionbridge.ai/datasets/tensorflow-datasets-machine-learning/
  88. https://cloud.google.com/bigquery/public-data/ https://towardsdatascience.com/bigquery-public-datasets-936e1c50e6bc
  89. https://christopherzita.medium.com/how-to-download-google-images-using-python-2021-82e69c637d59
  90. 28)https://data.gov.in/
  91. 29)imagenet dataset-http://www.image-net.org/
  92. 30)https://parulpandey.com/2020/08/09/getting-datasets-for-data-analysis-tasks%e2%80%8a-%e2%80%8aadvanced-google-search/
  93. 31)https://storage.googleapis.com/openimages/web/index.html ,
  94. https://storage.googleapis.com/openimages/web/visualizer/index.html?set=train&type=segmentation&r=false&c=%2Fm%2F09qck
  95. https://console.cloud.google.com/marketplace/browse?filter=solution-type:dataset&_ga=2.35328417.1459465882.1589693499-869920574.1589693499
  96. https://catalog.data.gov/dataset?groups=education2168#topic=education_navigation
  97. https://vincentarelbundock.github.io/Rdatasets/datasets.html
  98. 32)coco dataset https://cocodataset.org/#explore
  99. 33)huggingface datasets-https://github.com/huggingface/datasets https://huggingface.co/datasets https://huggingface.co/languages
  100. pip install datasets
  101. 34)Big Bad NLP Database-https://datasets.quantumstat.com/
  102. fast.ai Datasets https://course.fast.ai/datasets
  103. https://github.com/niderhoff/nlp-datasets
  104. 600 NLP Datasets and Glory https://pub.towardsai.net/600-nlp-datasets-and-glory-4b0080bf5ab
  105. nlp-datasets https://github.com/karthikncode/nlp-datasets
  106. https://analyticsindiamag.com/15-most-important-nlp-datasets/ https://medium.com/ai-in-plain-english/25-free-datasets-for-natural-language-processing-57e407402c60
  107. 35)https://www.edureka.co/blog/25-best-free-datasets-machine-learning/
  108. 36)bigquery public dataset ,Google Public Data Explorer
  109. https://cloud.google.com/public-datasets https://guides.library.cmu.edu/machine-learning/datasets
  110. 37)inbuilt library data eg:iris dataset,mnist dataset,etc...
  111. pandas-datareader https://github.com/pydata/pandas-datareader
  112. tf.data.Datasets for TensorFlow Datasets
  113. 38)https://data.gov.sg/ https://data.gov.au/ https://data.europa.eu/euodp/en/data https://data.govt.nz/
  114. data.gov.be ,data.egov.bg/ ,data.gov.cz/english ,portal.opendata.dk,govdata.de,opendata.riik.ee,data.gov.ie,data.gov.gr,datos.gob.es,data.gouv.fr,data.gov.hr
  115. dati.gov.it,data.gov.cy,opendata.gov.lt,data.gov.lv,data.public.lu,data.gov.mt,data.overheid.nl,data.gv.at,danepubliczne.gov.pl,dados.gov.pt,data.gov.ro,podatki.gov.si
  116. data.gov.sk,avoindata.fi,oppnadata.se,https://data.adb.org/ ,https://data.iadb.org/ ,https://www.weforum.org/agenda/2018/03/latin-america-smart-cities-big-data/
  117. https://data.fivethirtyeight.com/ , https://wiki.dbpedia.org/ ,https://www.europeandataportal.eu/en ,https://data.europa.eu/ ,https://www.census.gov/,
  118. https://www.who.int/data/gho ,https://data.unicef.org/open-data/ ,http://data.un.org/ ,https://data.oecd.org/ ,https://data.worldbank.org/
  119. 39.Awesome Public Dataset- https://github.com/awesomedata/awesome-public-datasets
  120. Get OpenMLs Dataset in One Line of Code https://mathdatasimplified.com/2021/04/23/fetch_openml-get-openmls-dataset-in-one-line-of-code/
  121. https://github.com/the-pudding/data
  122. datasets https://github.com/benedekrozemberczki/datasets
  123. kdnuggets https://www.kdnuggets.com/datasets/index.html
  124. Hub https://github.com/activeloopai/Hub
  125. 40.Datasets for Machine Learning on Graphs-https://ogb.stanford.edu/
  126. 41.https://www.johnsnowlabs.com/data/
  127. 42.30 largest tensorflow datasets-https://lionbridge.ai/datasets/tensorflow-datasets-machine-learning/
  128. 43. coco dataset-https://cocodataset.org/#home
  129. flickr-downloader https://github.com/renatoviolin/flickr-downloader/
  130. Google Open images-https://opensource.google/projects/open-images-dataset https://storage.googleapis.com/openimages/web/index.html
  131. 50+ Object Detection Datasets-https://medium.com/towards-artificial-intelligence/50-object-detection-datasets-from-different-industry-domains-1a53342ae13d
  132. 70+ Image Classification Datasets from different Industry domains-https://medium.com/towards-artificial-intelligence/70-image-classification-datasets-from-different-industry-domains-part-2-cd1af6e48eda
  133. VisualData Discovery https://www.visualdata.io/discovery https://guides.library.cmu.edu/machine-learning/datasets
  134. data https://storage.googleapis.com/openimages/web/visualizer/index.html?c=%2Fm%2F04yqq2&r=false&set=train&type=segmentation&utm_campaign=Weekly%20Machine%20Learning%20news&utm_medium=email&utm_source=Revue%20newsletter
  135. VisualData https://www.visualdata.io/discovery
  136. bifrost- https://datasets.bifrost.ai/
  137. satellite images https://towardsdatascience.com/finding-satellite-images-for-your-data-science-project-888695361925
  138. https://public.roboflow.com/
  139. https://www.visualdata.io/discovery http://www.image-net.org/ https://www.cs.toronto.edu/~kriz/cifar.html
  140. tensorflow_datasets.object_detection - https://storage.googleapis.com/openimages/web/index.html
  141. https://github.com/google-research-datasets/Objectron/ https://ai.googleblog.com/2020/11/announcing-objectron-dataset.html?m=1
  142. http://idd.insaan.iiit.ac.in/ http://database.mmsp-kn.de/koniq-10k-database.html
  143. https://ai.googleblog.com/2020/11/announcing-objectron-dataset.html
  144. https://www.visualdata.io/discovery https://blogs.bing.com/maps/2019-03/microsoft-releases-12-million-canadian-building-footprints-as-open-data
  145. https://blogs.bing.com/maps/2019-09/microsoft-releases-18M-building-footprints-in-uganda-and-tanzania-to-enable-ai-assisted-mapping
  146. https://datasets.bifrost.ai/ https://storage.googleapis.com/openimages/web/download.html https://computervisiononline.com/datasets http://yacvid.hayko.at/
  147. https://www.cogitotech.com/use-cases/biodiversity/
  148. ImageNet data -http://image-net.org/
  149. ApolloScape Dataset-http://apolloscape.auto/
  150. https://github.com/chrieke/awesome-satellite-imagery-datasets
  151. 44.https://github.com/fivethirtyeight/data
  152. 45.Recommender Systems Datasets-https://cseweb.ucsd.edu/~jmcauley/datasets.html
  153. 46.indiadataportal-https://indiadataportal.com/
  154. 47.US Government Open Dataset: https://www.data.gov/
  155. https://censusreporter.org/ https://data.census.gov/cedsci/
  156. 48.AWS Public Data Sets:https://registry.opendata.aws/ https://aws.amazon.com/opendata/?wwps-cards.sort-by=item.additionalFields.sortDate&wwps-cards.sort-order=desc
  157. 49.https://the-eye.eu/public/AI/pile_preliminary_components/
  158. Reddit -https://www.reddit.com/r/datasets/
  159. wikipedia-https://en.wikipedia.org/wiki/List_of_datasets_for_machine-learning_research
  160. http://opendata.cern.ch/ , https://www.imf.org/en/Data
  161. Global Health Observatory data repository-https://apps.who.int/gho/data/node.main
  162. CERN Open Data Portal-http://opendata.cern.ch/
  163. TensorFlow Datasets https://www.tensorflow.org/datasets
  164. 50.openblender- https://www.openblender.io/#/welcome
  165. 51.Top 10 Datasets For Cybersecurity Projects- https://analyticsindiamag.com/top-10-datasets-for-cybersecurity-projects/
  166. 52.Datasets from Web Crawl Data (nlp)-http://data.statmt.org/cc-100/
  167. 53.https://www.springboard.com/blog/free-public-data-sets-data-science-project/
  168. 54.NASA - https://nasa.github.io/data-nasa-gov-frontpage/ace
  169. 55.Academic Torrents,GitHub Datasets,CERN Open Data Portal,Global Health Observatory Data Repository
  170. 56.32 Data Sets to Uplift your Skills in Data Science-https://blog.datasciencedojo.com/data-sets-data-science-skills/?utm_content=144243072&utm_medium=social&utm_source=linkedin&hss_channel=lcp-3740012
  171. https://lionbridge.ai/datasets/the-50-best-free-datasets-for-machine-learning/
  172. 57.OpenDaL-https://opendatalibrary.com/
  173. Data Is Plural-https://docs.google.com/spreadsheets/d/1wZhPLMCHKJvwOkP4juclhjFgqIY8fQFMemwKL2c64vk/edit#gid=0
  174. VisualData-https://www.visualdata.io/discovery
  175. https://medium.com/towards-artificial-intelligence/best-datasets-for-machine-learning-data-science-computer-vision-nlp-ai-c9541058cf4f
  176. 58.Pandas Data Reader-https://pandas-datareader.readthedocs.io/en/latest/remote_data.html
  177. 59.ieee-dataport-https://ieee-dataport.org/datasets
  178. https://medium.com/towards-artificial-intelligence/best-datasets-for-machine-learning-data-science-computer-vision-nlp-ai-c9541058cf4f
  179. https://github.com/neomatrix369/awesome-ai-ml-dl/blob/master/data/datasets.md#datasets-and-sources-of-raw-data
  180. 60.Generating Realistic Fake Data https://towardsdatascience.com/free-resources-for-generating-realistic-fake-data-da63836be1a8
  181. Full Synthetic Data ,Partial Synthetic Data,Hybrid Synthetic Data
  182. Faker is a Python package that generates fake data-https://github.com/joke2k/faker
  183. ydata-synthetic, Gretel, gretel-synthetics, GenerateData, DataSynthesizer, SDV, SDGym, SDMetrics, Copulas, kubric, CTGAN, Synthea, synthia, nbsynthetic, pydbgen, synthpop, faker, Tonic, ydata, Mostly AI, Mirry.ai, Hazy, Diveplane, Datagen, Mimesis, FauxFactory, Radar, PikaAccelario, Chooch, Datomize, Deep Vision Data, Monitaur, OpenSynthetics, Replica Analytics, Scale AI, SKY ENGINE AI, Synthesis AI, Plaitpy, TimeseriesGenerator, Accelario, dgutils, AI.Reverie, Kinetic Vision, SynthDet, Mockaroo, JSON Schema Faker, FakeStoreAPI, Mock Turtle, AiFi, Anyverse, Cvedia, OneView, TRGD, Parallel Domain, Mindtech, Edgecase.ai, Statice, Rendered.ai, Facteus, Synthesized, Syntheticus, Syntho, Stable Diffusion, GenRocket, MDClone, NVIDIA Omniverse, Infinity AI, Paella, Clearbox AI, RDT (Reversible Data Transforms), DeepEcho
  184. Models: GANs, CTGAN, WGAN, WGAN-GP, VAEs, TimeGAN, AR
  185. GAN-based Deep Learning data synthesizer CTGAN,CopulaGAN,Synthetic Data Vault,Probabilistic AutoRegressive model
  186. Extract the metadata using DataDescriber, Compare the input and synthetic data using ModelInspector
  187. Mockaroo https://www.mockaroo.com/
  188. GenerateData https://site.generatedata4.com/
  189. JSON Schema Faker https://json-schema-faker.js.org/
  190. FakeStoreAPI https://fakestoreapi.com/
  191. graviti dataset https://gas.graviti.com/open-datasets
  192. Synthetic data for computer vision https://github.com/ZumoLabs/zpy
  193. GANs for Tabular Synthetic Data Generation https://github.com/Diyago/GAN-for-tabular-data
  194. Synthetic Image Datasets https://analyticsindiamag.com/unity-launches-synthetic-image-datasets-to-train-ai-models-faster/
  195. Synthetic structured data generators https://github.com/ydataai/ydata-synthetic
  196. gretel Synthetic Data API https://gretel.ai/
  197. Timeseries DGAN https://synthetics.docs.gretel.ai/en/latest/models/timeseries_dgan.html
  198. DatasetGAN: an automatic procedure to generate massive datasets of high-quality images
  199. Generating synthetic tabular data with GANs,Synthetic Time-Series Data by A GAN approach
  200. Unity Launches Synthetic Image Datasets https://www.marktechpost.com/2021/04/23/unity-launches-synthetic-image-datasets-to-train-ai-and-computer-vision-models-faster/
  201. Generate Your Own Dataset using GAN https://www.analyticsvidhya.com/blog/2021/04/generate-your-own-dataset-using-gan/
  202. Accuracy of synthetic data https://gretel.ai/blog/how-accurate-is-my-synthetic-data
  203. Synthetic data library https://github.com/finos/datahub https://github.com/agmmnn/awesome-blender https://opendata.blender.org/ https://www.youtube.com/watch?v=eZwOeBkLL8E
  204. https://www.kdnuggets.com/2019/09/scikit-learn-synthetic-dataset.html
  205. Fully Synthetic Data,Partially Synthetic Data ,Hybrid Synthetic Data https://towardsdatascience.com/synthetic-data-key-benefits-types-generation-methods-and-challenges-11b0ad304b55
  206. Synthetic Image Datasets https://analyticsindiamag.com/unity-launches-synthetic-image-datasets-to-train-ai-models-faster/ https://dockship.io/articles/607847e461373d1b994cc2dc/create-synthetic-images-using-opencv-(python)
  207. gretel-synthetics Synthetic data generators for structured and unstructured text, featuring differentially private learning. https://github.com/gretelai/gretel-synthetics
  208. Synthetic Data Generation Using Gaussian Mixture Model https://deepnote.com/@chanakya-vivek-kapoor/Synthetic-Data-Generation-QaaTRs73T2iCb0amHFbwpQ
  209. Synthetic Data Vault https://analyticsindiamag.com/guide-to-synthetic-data-vault-an-ecosystem-of-synthetic-data-generation-libraries/ https://github.com/sdv-dev/SDV
  210. Create Your own Image Dataset using Opencv https://www.analyticsvidhya.com/blog/2021/05/create-your-own-image-dataset-using-opencv-in-machine-learning/
  211. ydata-synthetic https://github.com/ydataai/ydata-synthetic
  212. Table Evaluator: evaluate real and synthetic datasets against each other https://github.com/Baukebrenninkmeijer/table-evaluator
  213. SDMetrics: evaluate the quality and efficacy of synthetic datasets https://github.com/sdv-dev/SDMetrics
  214. 61.Text Data Annotator Tool - Datasaur https://datasaur.ai/
  215. Tagalog: data management and labeling for Natural Language Processing https://www.tagalog.ai/tagalog/
  216. 62.Google Analytics cost data import https://segmentstream.com/google-analytics?utm_source=twitter&utm_medium=cpc&utm_campaign=ga_costs_import_en&utm_content=guide
  217. 63.https://lionbridge.ai/services/crowdsourcing/ https://lionbridge.ai/ https://www.clickworker.com/ https://appen.com/ https://www.globalme.net/
  218. 64.Azure Open Datasets https://azure.microsoft.com/en-us/services/open-datasets/ https://azure.microsoft.com/en-in/services/open-datasets/catalog/
  219. Yelp Open Dataset https://www.yelp.com/dataset
  220. https://data.world/
  221. ODK Open Data Kit- https://getodk.org/
  222. World Bank Open Data https://data.worldbank.org/
  223. https://analyticsindiamag.com/10-biggest-data-breaches-that-made-headlines-in-2020/
  224. https://data.mendeley.com/
  225. https://github.com/iamtekson/geospatial-data-download-sites
  226. https://eugeneyan.com/writing/data-discovery-platforms/
  227. 65.https://medium.com/towards-artificial-intelligence/best-datasets-for-machine-learning-data-science-computer-vision-nlp-ai-c9541058cf4f
  228. https://towardsdatascience.com/data-repositories-for-almost-every-type-of-data-science-project-7aa2f98128b
  229. https://github.com/MTG/freesound-datasets
  230. https://dataform.co/
  231. https://github.com/rfordatascience/tidytuesday https://www.youtube.com/watch?v=vCBeGLpvoYM
  232. https://www.analyticsvidhya.com/blog/2020/12/top-15-datasets-of-2020-that-every-data-scientist-should-add-to-their-portfolio/?utm_source=linkedin&utm_medium=AV|link|high-performance-blog|blogs|44181|0.375
  233. https://cseweb.ucsd.edu/~jmcauley/datasets.html
  234. 66.https://en.wikipedia.org/wiki/List_of_datasets_for_machine-learning_research
  235. https://archive.org/details/datasets
  236. https://commoncrawl.org/
  237. https://www.youtube.com/watch?v=1aUt8zAG09E
  238. 67. 6 Sources of Financial Data https://medium.datadriveninvestor.com/financial-data-431b75975bb
  239. yfinance for finance data https://github.com/ranaroussi/yfinance https://medium.com/towards-artificial-intelligence/algorithmic-trading-with-python-and-machine-learning-part-1-47c56706c182
  240. import yfinance as yf (formerly fix_yahoo_finance), yahoofinancials, pandas-datareader, Twelve Data
  241. financeapi https://towardsdatascience.com/pull-and-analyze-financial-data-using-a-simple-python-package-83e47759c4a7
  242. Investing.com (pip install investpy), Kite by Zerodha (pip install kiteconnect), Quandl (pip install quandl)
  243. https://www.analyticsvidhya.com/blog/2021/01/bear-run-or-bull-run-can-reinforcement-learning-help-in-automated-trading/?utm_source=feedburner&utm_medium=email&utm_campaign=Feed%3A+AnalyticsVidhya+%28Analytics+Vidhya%29
  244. Downloading Historical Stock prices with Alpha Vantage https://medium.com/towards-artificial-intelligence/downloading-historical-stock-prices-with-alpha-vantage-688edad46a6d
  245. Pandas Datareader https://pandas-datareader.readthedocs.io/en/latest/ https://www.youtube.com/watch?v=f2BCmQBCwDs
  246. Get Financial Data Directly into Python https://www.quandl.com/tools/python https://medium.com/nerd-for-tech/how-to-get-financial-data-using-python-7a508f25fc39
  247. openml https://www.openml.org/search?type=data
  248. https://registry.opendata.aws/
  249. voice_datasets https://github.com/jim-schwoebel/voice_datasets
  250. Dynamically-Generated-Hate-Speech-Dataset https://github.com/bvidgen/Dynamically-Generated-Hate-Speech-Dataset
  251. 68.doccano, an open source text annotation tool https://github.com/doccano/doccano
  252. 69.https://www.dataquest.io/blog/free-datasets-for-projects/
  253. 70.audio set https://research.google.com/audioset/
  254. 71.FlatData Flat explores how to make it easy to work with data in git and GitHub https://octo.github.com/projects/flat-data?utm_campaign=Data_Elixir&utm_source=Data_Elixir_337
  255. 72.Snorkel is an open-source Python library for programmatically building training datasets without manual labeling. https://www.snorkel.org/ https://towardsdatascience.com/snorkel-programmatically-build-training-data-in-python-712fc39649fe

2.Feature engineering

https://towardsdatascience.com/practical-code-implementations-of-feature-engineering-for-machine-learning-with-python-f13b953d4bcd

Feature-engine https://trainindata.medium.com/feature-engine-a-new-open-source-python-package-for-feature-engineering-29a0ab88ea7c https://feature-engine.readthedocs.io/en/latest/ https://github.com/solegalli/feature_engine https://www.datasciencecentral.com/profiles/blogs/feature-engine-python-package-for-feature-engineering

Automated feature engineering https://medium.com/ibm-data-ai/automated-feature-engineering-for-relational-data-with-autoai-3612fafe9f89

Automated Data Wrangling https://catalyst.coop/2021/05/23/automated-data-wrangling/

Automatic Feature Engineering Using Featurewiz https://towardsdatascience.com/automate-your-feature-selection-workflow-in-one-line-of-python-code-3d4f23b7e2c4 https://github.com/AutoViML/featurewiz

Automatic Feature Engineering Using AutoFeat https://analyticsindiamag.com/guide-to-automatic-feature-engineering-using-autofeat/

Upgini accuracy improving features https://github.com/upgini/upgini https://upgini.com/

Categorical Encoding https://github.com/scikit-learn-contrib/category_encoders

lazytransform https://github.com/AutoViML/lazytransform

Streamlining Feature Engineering Pipelines with Feature-engine https://towardsdatascience.com/streamlining-feature-engineering-pipelines-with-feature-engine-e781d551f470 https://feature-engine.readthedocs.io/en/latest/#

Validate your Data (Schema) https://towardsdatascience.com/introduction-to-schema-a-python-libary-to-validate-your-data-c6d99e06d56a

Validate Your pandas DataFrame with Pandera https://github.com/pandera-dev/pandera

Statistical DataFrame Testing Toolkit https://pandera.readthedocs.io/en/stable/index.html

Data storing format:Pickle,Parquet,Feather,Avro,ORC

Data cleaning-Pyjanitor-https://analyticsindiamag.com/beginners-guide-to-pyjanitor-a-python-tool-for-data-cleaning/

data cleaning library https://www.analyticsvidhya.com/blog/2021/05/data-cleaning-libraries-in-python-a-gentle-introduction/

Mage https://github.com/mage-ai/mage-ai

Cleaner Data Analysis with Pandas Using Pipes https://towardsdatascience.com/cleaner-data-analysis-with-pandas-using-pipes-4d73770fbf3c

DataPrep https://dataprep.ai/ https://github.com/sfu-db/dataprep https://towardsdatascience.com/dataprep-v0-3-0-has-been-released-be49b1be0e72

Dora (pip library) - data cleaning

Dora,PrettyPandas,DataCleaner,Tabulate,Pyjanitor,OpenRefine,cleanlab,pandera

https://github.com/sfu-db/dataprep https://github.com/akanz1/klib https://www.bitrook.com/ https://github.com/rhiever/datacleaner https://github.com/johnkerl/miller

cleanlab data-centric AI and machine learning with label errors, finding mislabeled data, and uncertainty quantification. Works with most datasets and models https://github.com/cleanlab/cleanlab

cleantext https://www.youtube.com/watch?v=i2TjAgga1YU&feature=youtu.be

CleanText: A Python Package to Clean Raw Text Data https://analyticsindiamag.com/guide-to-cleantext-a-python-package-to-clean-raw-text-data/

ATOM https://github.com/tvdboom/ATOM https://towardsdatascience.com/how-to-test-multiple-machine-learning-pipelines-with-just-a-few-lines-of-python-1a16cb4686d

openrefine A free, open source, powerful tool for working with messy data https://openrefine.org/#

https://machinelearningmastery.com/basic-data-cleaning-for-machine-learning/

Speed Up Data Cleaning and Exploratory Data Analysis in Python with klib https://github.com/akanz1/klib https://towardsdatascience.com/speed-up-your-data-cleaning-and-preprocessing-with-klib-97191d320f80

missingno https://github.com/ResidentMario/missingno

Take the Pain Out of Data Cleaning for Machine Learning https://towardsdatascience.com/take-the-pain-out-of-data-cleaning-for-machine-learning-20a646a277fd

dabl https://ms-bharti.medium.com/jump-start-your-supervised-learning-task-with-dabl-e479323e81fe

Easy to use Python library of customized functions for cleaning and analyzing data https://github.com/akanz1/klib

PyOD https://pyod.readthedocs.io/en/latest/ https://github.com/yzhao062/pyod/blob/development/docs/index.rst https://towardsdatascience.com/how-to-detect-outliers-with-python-pyod-aa7147359e4b

Amazon’s New Visual Data Cleaning Tool Can Speed Up Your AI Projects https://medium.com/dataseries/how-amazons-new-visual-data-tool-can-speed-up-your-ai-projects-68e3289382c

Featuretools https://www.featuretools.com/ https://towardsdatascience.com/why-automated-feature-engineering-will-change-the-way-you-do-machine-learning-5c15bf188b96

https://github.com/alteryx/featuretools https://analyticsindiamag.com/introduction-to-featuretools-a-python-framework-for-automated-feature-engineering/

Feature Selection using Genetic Algorithm https://github.com/kaushalshetty/FeatureSelectionGA

AutoFeat https://analyticsindiamag.com/guide-to-automatic-feature-engineering-using-autofeat/ https://github.com/cod3licious/autofeat

feast Feature Store for Machine Learning https://github.com/feast-dev/feast https://www.youtube.com/watch?v=ZeJdr0nZ9PA

Category Encoders https://contrib.scikit-learn.org/category_encoders/

Feature-engine https://feature-engine.readthedocs.io/en/latest/index.html

FeatureTools,AutoFeat,TsFresh,Cognito,OneBM,ExploreKit,PyFeat,Category Encoders,Feature-engine

Automated Feature Selection: Featurewiz https://github.com/AutoViML/featurewiz https://towardsdatascience.com/featurewiz-fast-way-to-select-the-best-features-in-a-data-9c861178602e

zoofs a Python library for performing feature selection https://github.com/jaswinder9051998/zoofs

Feature Engineering of DateTime Variables for Data Science, Machine Learning https://www.kdnuggets.com/2021/04/feature-engineering-datetime-variables-data-science-machine-learning.html

NeatText a simple NLP package for cleaning textual data and text preprocessing https://github.com/Jcharis/neattext

Remove duplicate data in the dataset, data validity check, contaminated data, inconsistent data, invalid data

Feature Selection

1.Removal of arbitrary features: DropFeatures

Removing unused columns,Removing Constant features,Removing Constant Features using VarianceThreshold,Removing Quasi-Constant Features,Removing Duplicate Columns

2.Removal of constant and almost constant features: DropConstantFeatures

Removal of Low Variance

removal of irrelevant data

3.Removal of duplicated variables: DropDuplicateFeatures

4.Removal of correlated features: DropCorrelatedFeatures, SmartCorrelatedSelection

Drop features that have a poor correlation with the response variable

5.Selection of features by value shuffling: SelectByShuffling

Selection of features by High correlation with the target variable

6.Selection of features by univariate performance: SelectBySingleFeaturePerformance

7.Selection of features by target encoding: SelectByTargetMeanPerformance

8.Recursive Feature Elimination: RecursiveFeatureElimination

9.Recursive Feature Addition: RecursiveFeatureAddition
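
The first selection steps listed above (constant, duplicated and correlated features) can be sketched with plain pandas. This is a minimal illustration, not the Feature-engine implementation: the helper names, the toy DataFrame and the 0.9 correlation threshold are all assumptions made for the example.

```python
import numpy as np
import pandas as pd

def drop_constant(df):
    """Drop columns that hold a single unique value."""
    keep = [c for c in df.columns if df[c].nunique(dropna=False) > 1]
    return df[keep]

def drop_duplicate_columns(df):
    """Drop columns whose values repeat an earlier column exactly."""
    return df.loc[:, ~df.T.duplicated()]

def drop_correlated(df, threshold=0.9):
    """Drop the later column of each pair with |Pearson r| above threshold."""
    corr = df.corr().abs()
    # Keep only the upper triangle so each pair is inspected once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [c for c in upper.columns if (upper[c] > threshold).any()]
    return df.drop(columns=to_drop)

# Hypothetical toy data: one constant, one duplicate, one near-duplicate column.
df = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0, 5.0],
    "a_copy": [1.0, 2.0, 3.0, 4.0, 5.0],
    "const": [7.0] * 5,
    "a_scaled": [2.1, 4.0, 6.2, 8.1, 9.9],
    "b": [5.0, 1.0, 4.0, 2.0, 3.0],
})
selected = drop_correlated(drop_duplicate_columns(drop_constant(df)))
print(list(selected.columns))
```

Which column of a correlated pair is kept here depends only on column order; in practice the choice is usually guided by domain knowledge or by the feature's relationship with the target.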

Statistics packages: stats, SciPy, Pingouin, statsmodels, SymPy, Sage

StatisticsGen component computes statistics

Check data types , Handle duplicate values

a.Handle missing value

  1. Types of missing value https://datamuni.com/@atsunorifujita/missing-value-imputation-using-datawig
  2. Handling Missing Values in Pandas https://pub.towardsai.net/handling-missing-values-in-pandas-f87cec928937
  3. Identify the source of missing data
  4. i.Missing completely at random (MCAR): missingness is unrelated to both the missing and the observed data, so rows can be deleted without disturbing the data distribution
  5. ii.Missing at random (MAR): missingness is related to other observed variables, so deleting rows can disturb the data distribution
  6. iii.Missing not at random (MNAR): there is a reason for the missing value and it is directly related to the value itself
  7. iv.Structured missing: the reason why the value is missing is known with certainty
  8. Identify Missingness Types With Missingno https://towardsdev.com/how-to-identify-missingness-types-with-missingno-61cfe0449ad9
  9. Univariate,Multivariate https://medium.com/fintechexplained/what-are-imputers-in-data-science-b72f8308322b
  10. Univariate imputation fills a column using only that column's values; multivariate imputation also uses the other columns
  11. 1.If the amount of missing data is small, delete it: a.row deletion b.column deletion c.pairwise deletion and listwise deletion
  12. Drop based on a threshold value, drop using a subset of columns
  13. 2.Replace with a statistic: mean (influenced by outliers), median (not influenced by outliers), mode, minimum, maximum, zero, constant
  14. Fill with Mean / Median of Column or Group Forward Fill or Forward Fill within Groups
  15. Mean and Median Fill with Groupby
  16. Pass another DataFrame to fillna function to fill up the missing values.
  17. Similar case Imputation
  18. 3.apply classifier algorithm to predict missing value
  19. Using Algorithms that support missing values
  20. Imputation using Deep Learning Library — Datawig https://github.com/awslabs/datawig
  21. 4.Simple Imputer, Multiple Imputation, Iterative Imputer, KNN Imputer, multivariate imputation, Verstack NaNImputer, Impyute MICE, substitution
  22. 5.Apply unsupervised learning
  23. 6.Random Imputation, Iterative Imputation, Random Sample Imputation
  24. 7.Adding a variable to capture NaN (missing term), imputation with the string ‘Missing’, adding a missing indicator
  25. 8.Arbitrary Value Imputation
  26. Treat missing values as a separate category
  27. Use domain knowledge
  28. 9.hot deck Imputation,Cold deck imputation
  29. 10.regression Imputation,Stochastic Regression Imputation,Interpolation and Extrapolation
  30. 11.End of Distribution Imputation
  31. 12.Arbitrary Value Imputation
  32. 13.Frequent Category Imputation
  33. 14.MICE Imputation,miceforest ( https://github.com/AnotherSamWilson/miceforest )
  34. Miss Forest https://github.com/stekhoven/missForest
  35. 15.interpolation https://www.analyticsvidhya.com/blog/2021/06/power-of-interpolation-in-python-to-fill-missing-values/ Interpolate or Interpolate within Groups
  36. Linear interpolation, polynomial interpolation, interpolation through padding
  37. Extrapolation and interpolation, time-based interpolation, spline interpolation, linear interpolation, smoothing, Bidirectional Recurrent Imputation for Time Series (BRITS)
  38. 16.Last Observation Carried Forward (LOCF), Next Observation Carried Backward (NOCB), rolling statistics, interpolation
  39. Single and Multiple Imputation,Univariate Imputation,Multivariate Imputation ,Iterative Imputer,MissForest Imputation,Stochastic Regression Imputation, Multiple Imputations, Datawig, Hot-Deck imputation, Extrapolation, Interpolation
  40. datawig Imputation of missing values in tables https://github.com/awslabs/datawig
  41. Imputation using K-NN,missForest,Random Forest-based Imputation,missingpy,som,Ann,mlp
  42. Model based procedure gaussian mixture model
  43. Imputation Using Deep Learning (Datawig),neural network for imputation,BRITS
  44. 15.autoimpute-https://github.com/kearnz/autoimpute
  45. 16.bfill / ffill Back Fill or Back Fill within Groups
  46. 17.Adding a variable to capture NAN
  47. 18.replace NAN with a new category
  48. 19.Missing indicator
  49. After dropping or imputing values, the feature distribution should stay roughly the same
  50. https://www.kdnuggets.com/2021/05/deal-with-categorical-data-machine-learning.html
  51. https://towardsdatascience.com/6-different-ways-to-compensate-for-missing-values-data-imputation-with-examples-6022d9ca0779
  52. https://stefvanbuuren.name/fimd/want-the-hardcopy.html https://www.datasciencecentral.com/profiles/blogs/how-to-treat-missing-values-in-your-data-1
  53. 20.Imputation with the string ‘Missing’ ,Addition of binary missing indicators
  54. 21.Algorithms robust to missing values - LightGBM
  55. datawig imputation https://github.com/awslabs/datawig
  56. 22.Cluster-based approach for missing value imputation Naive clustering,Column-sensitive clustering
  57. Top Data Cleaning Tools https://www.marktechpost.com/2022/02/20/top-data-cleaning-tools-for-data-science-and-machine-learning-projects-in-2022/
  58. OpenRefine https://openrefine.org/ https://github.com/OpenRefine/OpenRefine
  59. Data Ladder https://dataladder.com/
  60. re-data fix data issues https://github.com/re-data/re-data
  61. Automatically find and fix errors in your ML datasets. https://github.com/cleanlab/cleanlab
  62. Clean APIs for data cleaning https://github.com/pyjanitor-devs/pyjanitor
  63. datacleaner https://github.com/rhiever/datacleaner
  64. https://github.com/akanz1/klib https://pyjanitor-devs.github.io/pyjanitor/ https://dataprep.ai/ https://scrubadub.readthedocs.io/en/latest/index.html https://www.bitrook.com/
  65. AutoClean https://github.com/elisemercury/AutoClean
  66. Dora,PrettyPandas,DataCleaner,Tabulate,Pyjanitor
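
A few of the simpler strategies from the list above (missing indicator, median fill, group-wise median fill, and a separate 'Missing' category) can be sketched with pandas. The column names and values are hypothetical and only serve the illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data with gaps in a numeric and a categorical column.
df = pd.DataFrame({
    "city": ["A", "A", "A", "B", "B", "B"],
    "income": [10.0, np.nan, 14.0, 100.0, 120.0, np.nan],
    "segment": ["x", None, "x", "y", None, "y"],
})

# Missing indicator: record where values were absent before any filling.
df["income_missing"] = df["income"].isna().astype(int)

# Univariate fill: median of the whole column.
df["income_median"] = df["income"].fillna(df["income"].median())

# Group-wise fill: median within each city.
df["income_group_median"] = df["income"].fillna(
    df.groupby("city")["income"].transform("median")
)

# Categorical column: treat missing values as their own category.
df["segment"] = df["segment"].fillna("Missing")

print(df)
```

In this toy example the global median (57.0) sits between the two cities and fits neither, while the group-wise medians (12.0 and 110.0) stay close to each city's own values. That is the usual argument for filling within groups when a sensible grouping exists.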

b.Handle imbalance Collect More Data if possible,Try Resampling Your Dataset

  1. 1.Under Sampling (usually not preferred because data is lost): imbalanced-learn, Tomek links, Random Under-Sampling, Edited Nearest Neighbours, NearMiss
  2. Random majority under-sampling with replacement, Tomek Links under-sampling, under-sampling with Cluster Centroids, Condensed Nearest Neighbour, One-Sided Selection, Neighbourhood Cleaning Rule
  3. 2.Over Sampling: RandomOverSampler (new points are copies of existing minority points), SMOTETomek (new points are created from nearest neighbours, so it takes longer), Borderline-SMOTE, Borderline-SMOTE SVM, Borderline-SMOTE KNN, FAIR SMOTE, DBSMOTE, SMOTE-ENN, KMeans SMOTE, SVMSMOTE, SMOTE-NC, SMOTEN, ENNSMOTE, MSMOTE, SMOTE-CUT, ADASYN (Adaptive Synthetic Sampling), SMOTE + Tomek Links, Crucio SMOTEENN, Cluster-Based Over Sampling, Informed Over Sampling, Oversampling Using Gaussian Mixture Models. Also listed here: RandomUnderSampler, NearMiss, OSS & NCR under-sampling, BalancedBaggingClassifier, BalancedRandomForestClassifier
  4. Over-sampling followed by under-sampling : SMOTE + Tomek links,SMOTE + ENN
  5. smote_variants https://github.com/analyticalmindsltd/smote_variants
  6. https://towardsdatascience.com/5-smote-techniques-for-oversampling-your-imbalance-data-b8155bdbe2b5
  7. https://www.analyticsvidhya.com/blog/2017/03/imbalanced-data-classification/
  8. Ensemble-based: bagging-based techniques, boosting-based techniques, Adaptive Boosting (AdaBoost), Gradient Tree Boosting, XGBoost
  9. tools Imb-learn,SMOTE-Variants,Regression-ReSampling https://towardsdatascience.com/tools-to-handle-class-imbalance-bff20c3bf099
  10. Balancing data sets with Crucio ADASYN https://medium.com/softplus-publication/balancing-data-sets-with-crucio-adasyn-79f04ff0779d
  11. LoRAS: A Better Oversampling Algorithm https://analyticsindiamag.com/hands-on-guide-to-loras-a-better-oversampling-algorithm/ https://github.com/narek-davtyan/LoRAS
  12. https://towardsdatascience.com/7-over-sampling-techniques-to-handle-imbalanced-data-ec51c8db349f
  13. Combining Over and Under-sampling
  14. 3.class_weight: give more importance (weight) to the minority class (cost-sensitive algorithms)
  15. from sklearn.utils.class_weight import compute_class_weight
  16. Cost-sensitive learning, class-balanced loss, focal loss
  17. Weighted loss function
  18. 4.Use stratified k-fold to keep the class ratio constant across folds; use the stratify parameter of train_test_split
  19. Use K-fold Cross-Validation in the Right Way,Stratified Cross Validation,repeated K-fold Cross-Validation,Stratified K-fold Cross-Validation
  20. Stratified Sampling,Stratified splits
  21. 5.Weighted Neural Network
  22. cluster based sampling
  23. 6.MESA https://analyticsindiamag.com/guide-to-mesa-boost-ensemble-imbalanced-learning-with-meta-sampler/
  24. 7.Choose a proper evaluation metric: ROC AUC, F1, etc.
  25. https://machinelearningmastery.com/framework-for-imbalanced-classification-projects/ https://www.kdnuggets.com/2020/01/5-most-useful-techniques-handle-imbalanced-datasets.html
  26. 8.Deep Imbalanced Regression https://github.com/YyzHarry/imbalanced-regression https://analyticsindiamag.com/deep-imbalanced-regression-complete-guide/
  27. Imbalanced Dataset Sampler https://github.com/ufoym/imbalanced-dataset-sampler
  28. 9.Ensemble techniques: bagging-based techniques, boosting-based techniques
  29. BalancedBaggingClassifier,Threshold moving,Easy Ensemble classifier,Balanced Random Forest,Balanced Bagging,RUSBoost,MESA
  30. 10.Try different algorithms (ensemble techniques: bagging-based techniques, boosting-based techniques)
  31. model based (some models are particularly suited for imbalanced dataset)
  32. Algorithmic Ensemble Techniques,Tree-Based Algorithms
  33. 11.Try a Different Perspective ( consider as anomaly detection or change detection)
  34. Threshold Moving Methods,One-Class Classification,Customised Ensemble Algorithms
  35. Probability Tuning Algorithms,Calibrating Probabilities,Tuning the Classification Threshold
  36. 12.databalancer https://github.com/pradeepdev-1995/databalancer
  37. 13.collect more data
  38. 14.treat problem as anomaly detection
  39. 15.Combined Class Methods
  40. Here several methods are combined to handle imbalanced data better. For instance, SMOTE can be combined with other methods, giving MSMOTE (Modified SMOTE), SMOTEENN (SMOTE with Edited Nearest Neighbours), SMOTE-TL, SMOTE-EL, etc., to eliminate noise in imbalanced data sets
  41. 16.One-Class Algorithms,One-Class Support Vector Machines,Isolation Forests,Minimum Covariance Determinant,Local Outlier Factor,Mahalanobis Distance for One Class Classification
  42. 17.BalancedBatchGenerator https://imbalanced-learn.org/stable/references/generated/imblearn.keras.BalancedBatchGenerator.html
  43. 18.train_test_split stratify attribute , stratify split
  44. 19. https://github.com/pradeepdev-1995/databalancer
  45. Meta's balance package https://github.com/facebookresearch/balance
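
Two of the lighter-weight options above, class weights and stratified splitting, can be sketched with scikit-learn. The 90/10 label split and the single dummy feature are assumptions chosen to keep the arithmetic easy to follow.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.utils.class_weight import compute_class_weight

# Hypothetical imbalanced labels: 90 negatives, 10 positives.
y = np.array([0] * 90 + [1] * 10)
X = np.arange(len(y)).reshape(-1, 1)

# "balanced" weights are n_samples / (n_classes * class_count).
weights = compute_class_weight(
    class_weight="balanced", classes=np.array([0, 1]), y=y
)
class_weight = dict(zip([0, 1], weights))
print(class_weight)

# stratify=y keeps the 90/10 ratio in both the train and the test part.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)
print(y_train.mean(), y_test.mean())
```

The resulting dictionary can be passed as the `class_weight` argument of many scikit-learn classifiers, which makes errors on the minority class cost more during training.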

c.Remove noise data

d.Format data

d.Discretize
a.Equal width binning
b.Equal frequency binning
c.K-means Binning
d.Discretization by Decision Trees
e.ChiMerge
f.Arbitrary Discretization
g.Quantile
h.Custom Discretization

  1. Discretisation plus categorical encoding, discretisation with classification trees, domain knowledge discretisation
  2. Data Binning
  3. Binning based on distribution (quantile-cut),Binning based on values (cut)
  4. Bucketing , quantile bucketing ,Clipping
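
Equal-width, equal-frequency (quantile) and custom binning from the list above can be sketched with `pd.cut` and `pd.qcut`. The ages, bin counts, bin edges and labels are hypothetical values picked for the example.

```python
import pandas as pd

ages = pd.Series([1, 5, 12, 18, 25, 33, 47, 52, 68, 90])

# Equal-width binning: 3 intervals of the same width over the value range.
equal_width = pd.cut(ages, bins=3, labels=["low", "mid", "high"])

# Equal-frequency binning: 5 quantile bins holding the same number of rows.
equal_freq = pd.qcut(ages, q=5, labels=False)

# Custom (arbitrary / domain-knowledge) binning with hand-picked edges.
custom = pd.cut(ages, bins=[0, 17, 64, 120], labels=["minor", "adult", "senior"])

print(pd.DataFrame({
    "age": ages,
    "equal_width": equal_width,
    "equal_freq": equal_freq,
    "custom": custom,
}))
```

Equal-width bins can end up very uneven on skewed data (here 5, 3 and 2 rows), whereas quantile bins hold the same number of rows by construction.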

e.Handle categorical data Ordinal,Nominal,cyclic,binary categorical variables

  1. 1.One Hot Encoding , dummy, and effect coding,Similarity Encoding,Binary Encoding
  2. Rainbow Method for Label Encoding
  3. 2.Count Or Frequency Encoding
  4. 3.Ordinal encoding,Nominal Encoding,Monotonic ordinal encoding,Target Guided Ordinal Encoding,Target Guided Mean Encoding,Target-Mean-Encoding
  5. 4.Target encoding / Mean encoding,GapEncoder,MinHashEncoder,Target guided ordinal encoding,Bayesian Target Encoding
  6. Target Encoding,K-Fold Target Encoding,Leave-One-Out Target Encoding,Leave One fold out Target Encoding,Target Encoding with a Weighted Mean
  7. 5.Probability Ratio Encoding,Rank Encoding,Polynomial Encoding,Backward Difference Encoding
  8. 6.label encoding or .cat.codes ,Label Encoding with Rainbow Method
  9. 7.probability ratio encoding
  10. 8.woe(Weight_of_evidence)
  11. Word2Vec (word embedding)
  12. 9.one hot encoding with multi category (keep most frequently repeated only) (One hot encoding of top categories)
  13. 10.feature hashing,CatBoost Encoding
  14. 11.sparse csr matrix
  15. 12.entity embeddings,Categorical Embeddings
  16. 13.binary encoding,Base-N Encoding
  17. 14.Rare label encoding
  18. 15.Leave-one-out (LOO) encoding, Generalized Linear Mixed Model
  19. 16.hash encoding,MinHashEncoder,SimilarityEncoder,DatetimeEncoder,SuperVectorizer,FeatureHasher,DictVectorizer,HashingVectorizer,DecisionTreeEncoder
  20. 17.dummy encoding,NaN Encoding,bin counting scheme,effect coding scheme
  21. 18.Helmert Encoding,Backward Difference Encoding,James-Stein Encoding,M-estimator Encoding,Thermometer Encoder,Bayesian Encoders,Effect Encoding
  22. Helmert Encoding, Base N Encoding, Hash Encoding, Effect or Sum or Deviation Encoding, Backward Difference Encoding, M-Estimator Encoding, James-Stein Encoding, Thermometer Encoding, CatBoost Encoding, Binary Encoding, NaN Encoding, Polynomial Encoding, Expansion Encoding, Probability Ratio Encoding (PRatioEncoder), GLM encoder, GLMM Encoding, DecisionTreeEncoder, Similarity Encoding, GapEncoder, MinHashEncoder, TargetEncoder, MultiLabelBinarizer, Quantile Encoding, Summary Encoding, Leaf Encoding, Collapsing Categories
  23. Transform your categorical columns with imperio SmoothingTransformer
  24. entity encoder for categorical variable https://contrib.scikit-learn.org/category_encoders/
  25. Automatically selects the best encoder https://github.com/dirty-cat/dirty_cat
  26. Improve ML Model Performance by Combining Categorical Features https://towardsdatascience.com/improve-ml-model-performance-by-combining-categorical-features-a23efbb6a215
  27. https://towardsdatascience.com/beyond-one-hot-17-ways-of-transforming-categorical-features-into-numeric-features-57f54f199ea4
  28. https://towardsdatascience.com/how-to-encode-categorical-data-d44dde313131 https://towardsdatascience.com/python-for-finance-7-useful-libraries-that-you-should-know-e422b9e9aaba
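
Four of the encodings above (one-hot, count/frequency, target mean and label codes) can be sketched with pandas alone. The `color` and `target` columns are hypothetical. The target mean here is computed on the full data only for brevity; in practice it should be fitted on the training split, or out-of-fold as in the K-Fold variant listed above, to avoid target leakage.

```python
import pandas as pd

# Hypothetical toy data.
df = pd.DataFrame({
    "color": ["red", "blue", "red", "green", "red", "blue"],
    "target": [1, 0, 1, 0, 0, 1],
})

# One-hot encoding: one indicator column per category.
one_hot = pd.get_dummies(df["color"], prefix="color")

# Frequency encoding: replace each category by its relative frequency.
freq = df["color"].map(df["color"].value_counts(normalize=True))

# Target mean encoding: replace each category by the mean target value.
target_mean = df["color"].map(df.groupby("color")["target"].mean())

# Label encoding via .cat.codes (categories are ordered alphabetically).
codes = df["color"].astype("category").cat.codes

print(one_hot.join(pd.DataFrame({"freq": freq, "target_mean": target_mean, "code": codes})))
```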

f.Scaling of data

  1. 1.Normalisation
  2. 2.Standardization(StandardScaler)
  3. 3.Robust Scaler (less influenced by outliers because it uses the median and IQR)
  4. 4.Min Max Scaling
  5. 5.Mean normalization
  6. 6.maximum absolute scaling
  7. 7.Power Transformer Scaler
  8. 8.Scaling To Median And Quantiles,Scaling to minimum and maximum values,Scaling to the vector norm
  9. 9.unit vector scaler
  10. 10.Z-score standardization
  11. https://www.analyticsvidhya.com/blog/2020/07/types-of-feature-transformation-and-scaling/?utm_source=linkedin&utm_medium=KJ|link|high-performance-blog|blogs|44204|0.375
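
The formulas behind several of the scalers above can be written directly in NumPy. This is a sketch on a hypothetical array with one extreme value, added to show how each scaler reacts to an outlier; scikit-learn's StandardScaler, MinMaxScaler, RobustScaler and MaxAbsScaler offer the same transforms with a fit/transform interface.

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 4.0, 100.0])  # 100.0 plays the outlier

# Standardization (z-score): zero mean, unit variance.
standardized = (x - x.mean()) / x.std()

# Min-max scaling: map the values onto [0, 1].
min_max = (x - x.min()) / (x.max() - x.min())

# Robust scaling: center on the median, scale by the interquartile range.
q1, median, q3 = np.percentile(x, [25, 50, 75])
robust = (x - median) / (q3 - q1)

# Maximum absolute scaling: divide by the largest absolute value.
max_abs = x / np.abs(x).max()

print(standardized.round(2))
print(min_max.round(3))
print(robust)
print(max_abs)
```

With min-max scaling the four ordinary values are squeezed into roughly the first 3% of the range, while robust scaling keeps them evenly spread between -1 and 0.5.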

Probability and Statistics Packages : PyMC3, tensorflow-probability,Pyro,GPyTorch,hmmlearn,pomegranate,GPflow,patsy,pingouin,Orbit

A Q-Q plot, the Shapiro-Wilk normality test, the Lilliefors test, the Jarque-Bera test, the Kolmogorov-Smirnov test or the Anderson-Darling test can be used to check whether a feature is Gaussian (normally distributed). Linear regression and logistic regression tend to perform better with such features; if a feature is not normally distributed, use the methods below to bring it closer to a Gaussian distribution.

Normality test, histogram, Q-Q plot, KDE plot, skewness and kurtosis can be used to check for a normal distribution
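
The numeric checks mentioned above (skewness, kurtosis and the Shapiro-Wilk test) can be sketched with SciPy. The two samples are synthetic, and the sample size and seed are arbitrary assumptions; exact statistics will vary with them.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
normal_sample = rng.normal(loc=0.0, scale=1.0, size=500)
skewed_sample = rng.exponential(scale=1.0, size=500)

for name, x in [("normal", normal_sample), ("skewed", skewed_sample)]:
    # Shapiro-Wilk: a small p-value is evidence against normality.
    w_stat, p_value = stats.shapiro(x)
    print(
        f"{name}: skew={stats.skew(x):.2f} "
        f"excess_kurtosis={stats.kurtosis(x):.2f} "
        f"shapiro_W={w_stat:.3f} p={p_value:.3g}"
    )
```

For a roughly normal feature, skewness and excess kurtosis are both close to 0. A formal test on a large sample can reject normality for tiny deviations, so a Q-Q plot is a useful companion check.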

Fitter Library Finding the Best Distribution that Fits Your Data https://towardsdatascience.com/finding-the-best-distribution-that-fits-your-data-using-pythons-fitter-library-319a5a0972e9

The Anderson-Darling test can also be used to check a sample against distributions other than the normal

Basic Distributions - PDF, PMF, CDF, PPF, Uniform, Gaussian, Bernoulli, Multinomial, Normal Distribution, Poisson, Exponential, Geometric, Log-normal distribution, Pareto/Power Law Distribution

  1. b.Logarithmic Transformation,LogCpTransformer
  2. c.Reciprocal Transformation
  3. d.Square Root Transformation
  4. e.Exponential Transformation
  5. f.Box-Cox and Yeo-Johnson Transformation
  6. g.log(1+x) Transformation
  7. h.Johnson Transformation
  8. i.power transformations https://towardsdatascience.com/when-and-how-to-use-power-transform-in-machine-learning-2c6ad75fb72e
  9. j.Quantile Transformation, Arcsin Transformation, Inverse of Log, Inverse of Exponential, Inverse of Square Root, Square of Log, Square root of Exponential
  10. Root transformation,Cube root transformation,Cosine Transformation,SplineTransformer,FunctionTransformer,ArcsinTransformer
  11. Left skewness (use powers) Squares transformation,Cubes transformation,High powers
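
Some of the transformations above can be compared on a right-skewed sample by looking at the skewness before and after. This is a sketch on synthetic log-normal data (an assumption made for the example); Box-Cox requires strictly positive input, while Yeo-Johnson also accepts zero and negative values.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
x = rng.lognormal(mean=0.0, sigma=1.0, size=1000)  # right-skewed, strictly positive

log_x = np.log1p(x)                  # log(1 + x) transformation
sqrt_x = np.sqrt(x)                  # square root transformation
boxcox_x, lam_bc = stats.boxcox(x)   # lambda estimated by maximum likelihood
yeo_x, lam_yj = stats.yeojohnson(x)  # lambda estimated by maximum likelihood

for name, values in [
    ("original", x),
    ("log1p", log_x),
    ("sqrt", sqrt_x),
    ("box-cox", boxcox_x),
    ("yeo-johnson", yeo_x),
]:
    print(f"{name:12s} skew={stats.skew(values):.2f}")
```

A skewness close to 0 suggests the transformed feature is closer to symmetric, although that alone does not prove it is Gaussian.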

g.Remove low variance feature by using VarianceThreshold

remove Duplicate data,Low variation data,Irrelevant data,Incorrect data

remove Low entropy of categorical attributes

h.If a feature holds the same value in every row (only 1 unique value), remove the feature
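
Steps g and h can be sketched with scikit-learn's `VarianceThreshold`. The small matrix and the 0.2 threshold for quasi-constant features are assumptions for illustration; a suitable threshold depends on the scale of the features.

```python
import numpy as np
from sklearn.feature_selection import VarianceThreshold

# Hypothetical feature matrix: column 0 is constant, column 1 is nearly constant.
X = np.array([
    [0.0, 1.0, 10.0],
    [0.0, 1.0, 20.0],
    [0.0, 1.0, 30.0],
    [0.0, 2.0, 40.0],
])

# threshold=0.0 removes only features with zero variance (constant columns).
selector = VarianceThreshold(threshold=0.0)
X_reduced = selector.fit_transform(X)
kept = selector.get_support()
print(kept, X_reduced.shape)

# A higher threshold also removes quasi-constant features.
kept_strict = VarianceThreshold(threshold=0.2).fit(X).get_support()
print(kept_strict)
```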

i.Outliers: whether to remove outliers depends on the problem being solved https://github.com/jainyk/package-outlier

  1. Types of outliers: global outlier (a single data point that deviates from the distribution), local outlier, contextual (conditional) outliers, collective outliers (a group of data points that deviates from the distribution)
  2. e.g. in fraud detection, outliers are very important
  3. Methods to find outliers: Tukey's fences, KNN distance, autoencoders, standard deviation, z-score, boxplot, scatter plot, histogram, violin plot, IQR, TensorFlow_Data_Validation, SVM, One-Class SVM, Isolation Forest, K-means clustering, DBSCAN, percentile, Local Outlier Factor, One-Class Classification, Median Absolute Deviation
  4. Automatic outlier detection: Isolation Forest, DBSCAN clustering, Local Outlier Factor, standard deviation approach, K-means clustering, Minimum Covariance Determinant, Robust Random Cut Forest, One-Class Classification, One-Class SVM, autoencoder, outlier detection using in-degree number, Histogram-based Outlier Detection, Robust Covariance, PyNomaly, Angle-based Outlier Detection (ABOD), k-Nearest Neighbors Detector, Elliptic Envelope, cluster-based
  5. Outlier treatment: keep them, mean/median/random imputation, drop, discretization (binning), winsorization, treat as a separate group, replace with respective percentiles, standardize and scale the data, transformation (log, scaling, sqrt, power), replace the outlier values with a suitable value (like the 3rd standard deviation), percentile-based flooring and capping, trimming, treating outliers as missing values, top/bottom/zero coding, robust scaler, regularisation, arbitrary value
  6. Outlier capping with IQR Outlier capping with mean and std Outlier capping with quantiles Arbitrary capping
  7. Separation: if outliers make up an unusually large share of the data, separate them from the main data and fit a model on them separately
  8. Use a Different algorithm that is not sensitive to outliers
  9. Segment data so outliers are in a separate group
  10. Weighted means (which put more weight on the normal part of the distribution)
  11. Trimming: Remove outliers from dataset. However, it can remove large proportion of data.
  12. Capping: No data is removed. However, it distorts variable distribution.
  13. Missing data: The outliers are treated as missing data.
  14. Discretization: The outliers are put into lower and upper bins.
  15. Arbitrary capping: Domain knowledge of the variable is required to cap the min and max
  16. Winsorization: Truncate or cap extreme values to reduce the impact of outliers
  17. Transformation: Apply logarithmic or square root transformations
  18. Modeling techniques: Use robust regression or tree-based models
  19. Outlier removal: Remove the values with careful consideration if they pose an extreme challenge
  20. Separate Analysis : This involves performing separate analyses for the data with and without outliers
  21. Flagging : Create an additional variable to indicate outliers, providing transparency about their presence in the dataset.
  22. ML models that are less sensitive to outliers, such as: KNN, Decision Tree, SVM, Naïve Bayes, ensembles
  23. PyOD: A Python Toolkit For Outlier Detection https://analyticsindiamag.com/guide-to-pyod-a-python-toolkit-for-outlier-detection/
  24. TODS: An Automated Time-series Outlier Detection System https://github.com/datamllab/tods https://towardsdatascience.com/tods-detecting-outliers-from-time-series-data-2d4bd2e91381
  25. anomalib anomaly detection library https://github.com/openvinotoolkit/anomalib
  26. If outliers are present, use robust scaling
  27. alibi-detect https://github.com/SeldonIO/alibi-detect#adversarial-detection https://docs.seldon.io/projects/alibi-detect/en/latest/
  28. https://medium.com/towards-artificial-intelligence/outlier-detection-and-treatment-a-beginners-guide-c44af0699754
  29. https://towardsdatascience.com/two-outlier-detection-techniques-you-should-know-in-2021-1454bef89331
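The IQR capping and flagging treatments listed above can be sketched in a few lines of standard-library Python. This is only an illustration: the function name `iqr_cap`, the multiplier `k=1.5`, and the sample data are assumptions, not something the linked resources prescribe.

```python
import statistics

def iqr_cap(values, k=1.5):
    """Cap values outside [Q1 - k*IQR, Q3 + k*IQR] and flag them (hypothetical helper)."""
    # "inclusive" treats the data as the full population when interpolating quartiles
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    capped = [min(max(v, lower), upper) for v in values]   # capping: no row is removed
    flags = [v < lower or v > upper for v in values]        # flagging: keep a marker column
    return capped, flags, (lower, upper)

data = [10, 12, 11, 13, 12, 11, 10, 95]
capped, flags, fences = iqr_cap(data)
print(fences)   # lower and upper fence
print(capped)   # 95 is pulled down to the upper fence
print(flags)    # True only for the capped value
```

Capping keeps every row but distorts the distribution near the fences, which is why a flag column is returned alongside the capped values.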

j.Anomaly anomaly-detection-resources https://github.com/yzhao062/anomaly-detection-resources

  1. Types of Anomalies : Point anomalies,Contextual anomalies,Collective anomalies,Group Anomalies,Spatial Anomalies,Temporal Anomalies
  2. clustering techniques to find it
  3. Timetk https://towardsdatascience.com/timetk-the-r-library-for-time-series-analysis-9822f7720318
  4. Isolation Forest (for big data), Z-score, DBSCAN, Local Outlier Factor, One-Class Support Vector Machine, autoencoders, k-Nearest Neighbours, time series analysis, Elliptic Envelope, interquartile range, Median Absolute Deviation, Fast-MCD, K-means, histogram-based, PCA, Gaussian Mixture Model, Hidden Markov Models (HMM)
  5. PyOD
  6. Local Correlation Integral (LOCI), Histogram-based Outlier Detection (HBOS), Angle-based Outlier Detection (ABOD), Clustering-Based Local Outlier Factor (CBLOF), Minimum Covariance Determinant (MCD), Stochastic Outlier Selection (SOS), Spectral Clustering for Anomaly Detection (SpectralResidual), Feature Bagging, Average KNN, Connectivity-based Outlier Factor (COF), Variational Autoencoder (VAE)
  7. bootstrapping to remove the influence of the outlier data
  8. Anomaly detection using PyOD https://pyod.readthedocs.io/en/latest/ https://www.youtube.com/watch?v=QPjG_313GOw https://github.com/yzhao062/pyod https://pyod.readthedocs.io/en/latest/pyod.models.html
  9. ADBench https://github.com/Minqi824/ADBench
  10. Anomaly Detection Pyfbad https://github.com/Teknasyon-Teknoloji/pyfbad
  11. Anomalies are divided into three types: point/global anomalies, collective anomalies, contextual anomalies https://towardsdatascience.com/a-comprehensive-beginners-guide-to-the-diverse-field-of-anomaly-detection-8c818d153995
  12. https://github.com/zhuyiche/awesome-anomaly-detection
  13. https://medium.com/@ODSC/data-sciences-role-in-anomaly-detection-bd976f93d7e3
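Two of the simplest detectors named in the list above, the Z-score and the Median Absolute Deviation (MAD), can be compared with a short standard-library sketch. The thresholds (3.0 and 3.5), the 0.6745 scaling constant for the modified Z-score, and the sample readings are conventional choices used here as assumptions.

```python
import statistics

def zscore_anomalies(values, threshold=3.0):
    """Indices whose |z| exceeds the threshold, using the population standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return []
    return [i for i, v in enumerate(values) if abs(v - mean) / std > threshold]

def mad_anomalies(values, threshold=3.5):
    """Indices whose modified z-score (median/MAD based) exceeds the threshold."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return []
    return [i for i, v in enumerate(values)
            if 0.6745 * abs(v - med) / mad > threshold]

readings = [20.1, 19.8, 20.3, 20.0, 19.9, 20.2, 35.0, 20.1]
print(zscore_anomalies(readings))  # the extreme point inflates mean and std, so it can hide itself
print(mad_anomalies(readings))     # median and MAD are barely affected by the extreme point
```

On small samples a single extreme value inflates the mean and standard deviation enough to mask itself, which is one reason robust statistics such as MAD appear in the list.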

k.Sampling techniques

  1. Random Sampling,Systematic Sampling,Cluster Sampling,Weighted Sampling,Stratified Sampling
  2. a.biased sampling
  3. b.unbiased sampling
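As a minimal sketch of stratified sampling from the list above, the helper below draws the same fraction from every group so that class proportions are preserved. The function name `stratified_sample`, the fixed seed, and the toy rows are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(rows, key, fraction, seed=0):
    """Sample the same fraction from each stratum (at least one row per stratum)."""
    rng = random.Random(seed)          # seeded for reproducibility
    strata = defaultdict(list)
    for row in rows:
        strata[key(row)].append(row)   # group rows by their stratum label
    sample = []
    for members in strata.values():
        k = max(1, round(len(members) * fraction))
        sample.extend(rng.sample(members, k))   # sampling without replacement
    return sample

# 80 rows of class "a" and 20 rows of class "b"
rows = [("a", i) for i in range(80)] + [("b", i) for i in range(20)]
sample = stratified_sample(rows, key=lambda r: r[0], fraction=0.1)
print(len(sample))
```

Plain random sampling of 10 rows could easily return no "b" rows at all; stratifying guarantees the 80/20 ratio carries over to the sample.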

l.Feature Creation

  1. a.Combination of multiple features with mathematical operations
  2. b.Combination of multiple features with a reference value
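Both kinds of feature creation listed above can be illustrated on a single record. The field names (`weight_kg`, `height_m`, `signup_date`) and the helper `add_features` are invented for this sketch.

```python
from datetime import date

def add_features(record, reference_date):
    """Return a copy of the record with two derived features (hypothetical example)."""
    out = dict(record)
    # a. combination of multiple features with a mathematical operation
    out["bmi"] = record["weight_kg"] / record["height_m"] ** 2
    # b. combination of a feature with a reference value (here: a fixed reference date)
    out["days_since_signup"] = (reference_date - record["signup_date"]).days
    return out

record = {"weight_kg": 72.0, "height_m": 1.8, "signup_date": date(2023, 1, 1)}
enriched = add_features(record, reference_date=date(2023, 1, 31))
print(enriched["bmi"], enriched["days_since_signup"])
```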

3.Exploratory Data Analysis(eda)

  1. Explore the dataset using Python, Microsoft Excel, Atoti, Power BI, Datapane, Tableau, TabPy, SAS Business Intelligence and Analytics Tool, QlikView, PyToQlik, KNIME, Splunk, RapidMiner, Zoho Analytics, Sisense, etc.
  2. TabPy: Combining Python and Tableau https://www.kdnuggets.com/2020/11/tabpy-combining-python-tableau.html
  3. atoti https://www.atoti.io/ https://www.youtube.com/watch?v=Hb6mSXa14oo Datapane: Create a Beautiful Dashboard in Python in a Few Lines of Code https://towardsdatascience.com/datapanes-new-features-create-a-beautiful-dashboard-in-python-in-a-few-lines-of-code-a3c44523292b
  4. Switching from Spreadsheets to Neptune.ai https://neptune.ai/blog/switching-from-spreadsheets-to-neptune-ai
  5. Data Analysis using excel https://www.excel-easy.com/data-analysis.html https://www.educba.com/data-analysis-tool-in-excel/ https://www.youtube.com/watch?v=OOWAk2aLEfk
  6. Power BI In Jupyter Notebooks https://github.com/microsoft/powerbi-jupyter https://analyticsindiamag.com/microsoft-releases-power-bi-in-jupyter-notebooks/
  7. Mito Generating Python By Editing Spreadsheet https://www.youtube.com/watch?v=yy3-C39ra6s https://trymito.io/?source=twitter1
  8. Automate Pivot Table with Python https://towardsdatascience.com/automate-excel-with-python-pivot-table-899eab993966
  9. OpenPyXL: A Python Module For Excel https://analyticsindiamag.com/guide-to-openpyxl-a-python-module-for-excel/
  10. Causal: interactive dashboards and beautiful visuals https://www.causal.app/
  11. Visual Programming (Orange) https://orange.biolab.si/
  12. Integrating Tableau With Python https://analyticsindiamag.com/tabpy/ Qlib https://analyticsindiamag.com/qlib/
  13. Data visualization (Matplotlib, Seaborn, Dash, Plotly, Plotly Express, pyqtgraph, Bokeh, Pandas-Bokeh, Pygal, hvplot, holoviews, chartify, lets-plot, glue, plotnine, bqplot, toyplot, chart, itkwidgets, vedo, ipyvolume, pyvista, glumpy, geopandas, pycountry, geopy, geo-py, pypopulation, geotext, folium, cartopy, gmplot, ipyleaflet, geoviews, geoplot, splot, arviz, hypertools, geoplotlib, choropleth maps, Leafmap, Pydot, ggplot, visualizer, Greppo, Altair, networkx, graphviz, pygraphviz, python-igraph, pyvis, pygsp, ipycytoscape, nxviz, ipydagred3, Diffbot, etc.)
  14. Dashboarding : bokeh,dash,streamlit,panel,visdom ,voila,wave,jupyter-flex,ipyflex,pandas_bokeh
  15. OpenPyXL: automate Excel reporting; Datapane: a Python library to build interactive reports
  16. Scatterplot, binned scatterplot, multi-line plot, bubble chart, line chart, bar chart, histogram, boxplot, pie chart, distplot

Gantt chart, bubble charts, area plot, heat map, index plot, violin plot, time series plot, density plot, dot plot, strip plot, plotly, choropleth map, Kepler, PDF, kernel density function, networkx, scatter_matrix, bootstrap_plot, functionvis, higher-dimensional plots, 3-D plots, 3D plots with Matplotlib, 3D plots with Plotly, animated plot with Plotly, word clouds, HoloViz, horizontal bar graphs, stacked bar graphs, grouped bar graphs, raincloud plots, radviz, lag_plot, JoyPy plots, box and whisker plot, waterfall chart, pictogram chart, timeline, highlight table, bullet graph, network diagram, correlation matrices, bubble clouds, cartograms, circle views, dendrograms, dot distribution maps, open-high-low-close charts, polar areas, radial trees, ring charts, Sankey diagram, span charts, streamgraphs, treemaps, wedge stack graphs, table charts, lollipop charts, distplot, floWeaver

  1. hvplot A high-level plotting API for the PyData ecosystem built on HoloViews https://hvplot.holoviz.org/
  2. 50-charts https://towardsdatascience.com/how-did-i-classify-50-chart-types-by-purpose-a6b0aa5b812d
  3. all in one https://app.learney.me/
  4. Python Tool For Visualizing and Plotting 2D/3D Scientific Data https://analyticsindiamag.com/guide-to-mayavi-a-python-tool-for-visualizing-and-plotting-2d-3d-scientific-data/
  5. patchworklib - combine multiple py charts easily
  6. 7 Techniques to Visualize Geospatial Data https://www.kdnuggets.com/2017/10/7-techniques-visualize-geospatial-data.html
  7. data to viz https://www.data-to-viz.com/
  8. Interactive plots directly with pandas https://towardsdatascience.com/get-interactive-plots-directly-with-pandas-13a311ebf426
  9. Top 10 Data Visualization Tools https://www.analyticsvidhya.com/blog/2021/04/top-10-data-visualization-tools/ https://www.xenonstack.com/blog/data-visualization-tools/
  10. https://www.analyticsvidhya.com/blog/2021/03/when-to-use-what-plot-a-beginners-guide-to-select-plots-for-visualization/
  11. https://towardsdatascience.com/8-free-tools-to-make-interactive-data-visualizations-in-2021-no-coding-required-2b2c6c564b5b
  12. https://datavizproject.com/ https://datavizcatalogue.com/
  13. https://attachments.convertkitcdnm.com/232198/ee18f415-1406-4e5c-94f1-49a2c6e3ec4e/Statistics-The-Big-Picture-Poster.pdf
  14. https://towardsdatascience.com/8-free-tools-to-make-interactive-data-visualizations-in-2021-no-coding-required-2b2c6c564b5b
  15. HiPlot (high dimensional data)-https://github.com/facebookresearch/hiplot https://levelup.gitconnected.com/learn-hiplot-in-6-mins-facebooks-python-library-for-machine-learning-visualizations-330129d558ac
  16. https://towardsdatascience.com/top-6-python-libraries-for-visualization-which-one-to-use-fe43381cd658
  17. https://www.kaggle.com/abhishekvaid19968/data-visualization-using-matplotlib-seaborn-plotly
  18. Keras model visualization generator (ann-visualizer) - pip3 install graphviz
  19. univariate and bivariate and multivariate analysis
  20. model visualization Tensorboard,netron,playground tensorflow,plotly,TensorDash,Dash,Microscope,Lucid
  21. Distributions (discrete, continuous)
  22. Data distributions: normal distribution, standard normal distribution, Student's t-distribution, Bernoulli distribution, binomial distribution, Poisson distribution, uniform distribution, F distribution, covariance and correlation
  23. Pingouin statistical package https://pingouin-stats.org/index.html https://www.youtube.com/watch?v=zqi51Wu5qC0
  24. Types of statistics
  25. 1.Descriptive
  26. Descriptive statistics: mean, mode, standard deviation, median, absolute deviation, kurtosis, skewness
  27. 2.Inferential
  28. Types of data
  29. 1) Categorical (nominal, ordinal)
  30. 2) Numerical (discrete, continuous)
  31. Random variable (discrete random variable, continuous random variable)
  32. Quantile statistics: Q1, Q2, Q3, min, max, range, interquartile range
  33. Central Limit Theorem, Bayes' Theorem, confidence interval, hypothesis testing, z-test, t-test, F-test, one-tailed test, two-tailed test, chi-square test, ANOVA test, A/B testing
  34. Categorical X vs categorical Y: chi-square test, information gain, Cramér's V
  35. Categorical X vs numerical Y: Student's t-test, ANOVA, logistic regression, point-biserial correlation, or discretize Y and use the categorical vs categorical tests
  36. Numerical X vs categorical Y: Student's t-test, ANOVA, logistic regression, or discretize X and use the categorical vs categorical tests
  37. Numerical X vs numerical Y: correlation, linear regression, or discretize Y and/or X and use the tests for the resulting pair of types
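For the categorical vs categorical case above, the chi-square statistic and Cramér's V can be computed directly from a contingency table. This is a minimal sketch without bias correction; it assumes every row and column total is non-zero, and the function name `cramers_v` is hypothetical.

```python
import math

def cramers_v(table):
    """Return (chi-square statistic, Cramér's V) for a contingency table given as rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n   # count expected under independence
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return chi2, math.sqrt(chi2 / (n * k))

independent = [[10, 10], [10, 10]]   # no association between the two variables
dependent = [[20, 0], [0, 20]]       # perfect association
print(cramers_v(independent))
print(cramers_v(dependent))
```

Cramér's V rescales chi-square to the range 0 to 1, so it can be compared across tables of different sizes, unlike the raw statistic.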

4.Feature selection https://github.com/solegalli/feature-selection-for-machine-learning

  1. upgini Free automated data enrichment library for machine learning https://github.com/upgini/upgini https://upgini.com/
  2. FeatureSelector https://github.com/WillKoehrsen/feature-selector feature_engine https://github.com/solegalli/feature_engine
  3. 1.Filter methods (removing constant features, removing quasi-constant features, removing duplicated features, removing correlated features, feature importance, chi-square test, t-test, F-test, variance inflation factor (VIF), ANOVA test, information gain, F-score, mutual information, hypothesis tests, univariate selection methods, SelectKBest, SelectPercentile, variance threshold, Fisher's score, dispersion ratio, Mean Absolute Difference (MAD), correlation coefficient, condition index, etc.)
  4. 2.Wrapper methods (recursive feature elimination, recursive feature addition, SelectKBest, Boruta, mRMR, forward feature selection, backward feature elimination, bi-directional selection, exhaustive feature selection, stepwise selection, step forward selection, step backward selection, exhaustive search, etc.)
  5. 3.Embedded methods (lasso regression, ridge regression, elastic net regression, tree-based methods such as Random Forest importance, feature selection by tree importance, feature selection with decision trees, regression coefficients (logistic, linear), recursive feature elimination based on importance, least absolute deviation)
  6. 4.Hybrid methods (recursive feature selection, recursive feature addition, recursive feature elimination, feature shuffling, feature performance, target mean performance, permutation importance, population stability index, target encoding)
  7. unsupervised Feature selection:Principal Component Analysis,Independent Component Analysis,Non-Negative Matrix Factorization,t-distributed Stochastic Neighbor Embedding,Autoencoder
  8. Single-Agent Reinforcement Learning Feature Selection (SARLFS) ,Multi-Agent Reinforcement Learning Feature Selection (MARLFS)
  9. ITMO_FS is a feature selection library https://github.com/ctlab/ITMO_FS
  10. Sparse features: remove features, use LASSO regularization, make features dense (PCA, feature hashing), or use models that are robust to sparse features
  11. 5.Feature creation
  12. feature selection https://medium.com/analytics-vidhya/feature-selection-extended-overview-b58f1d524c1c
  13. mrmr_selection automatic feature selection at scale https://github.com/smazzanti/mrmr
  14. Feature selector https://github.com/WillKoehrsen/feature-selector
  15. Simulated Annealing https://github.com/kennethleungty/Simulated-Annealing-Feature-Selection
  16. boruta https://github.com/scikit-learn-contrib/boruta_py https://github.com/Ekeany/Boruta-Shap
  17. DropConstantFeatures DropDuplicateFeatures DropCorrelatedFeatures
  18. step forward feature selection https://www.kdnuggets.com/2018/06/step-forward-feature-selection-python.html
  19. automatic feature selection mrmr https://github.com/smazzanti/mrmr
  20. Creating New Features Deep Feature Synthesis https://docs.featuretools.com/en/v0.16.0/automated_feature_engineering/afe.html
  21. SequentialFeatureSelector: The popular forward and backward feature selection
  22. Alternative feature selection methods Feature shuffling,Feature performance,Target mean performance
  23. Automatic Feature Selection : recursive feature elimination and cross-validation
  24. Powershap: A Shapley feature selection method https://github.com/predict-idlab/powershap
  25. VarianceThreshold, chi-squared stats, ANOVA using f_classif, univariate linear regression tests using f_regression, F-score vs mutual information, mutual information for discrete values, mutual information for continuous values, SelectKBest, SelectPercentile, SelectFromModel, Recursive Feature Elimination, Extra Trees model
  26. 4.Feature Importance
  27. a.ExtraTreesClassifier, ExtraTreesRegressor
  28. b.SelectKBest
  29. c.Logistic Regression
  30. d.Random_forest_importance,Permutation Feature Importance
  31. e.decision tree
  32. f.Linear Regression
  33. g.xgboost
  34. h.Pearson correlation
  35. Forward selection,Chi-square,Logit (Logistic Regression model)
  36. 5.curse of dimensionality (as dimension increases performance decreases)
  37. 6.When features are highly correlated (multicollinearity), keep only one of them
  38. 7.dimension reduction
  39. 8.lasso regression to penalise unimportant features
  40. 9.VarianceThreshold ,selectkbest
  41. 10.model based selection
  42. 11.Mutual Information Feature Selection
  43. 12.remove features with very low variance (quasi constant feature dropping)
  44. 13.Univariate feature selection
  45. 14.importance of feature (random forest importance)
  46. 15.feature importance with decision trees
  47. 16.PyImpetus
  48. 17.drop constant features (variance=0) , Drop Highly Correlated Features
  49. 18.variance inflation factor(vif)
  50. 19.Recursive Feature Elimination RecursiveFeatureAddition
  51. 20.exhaustive feature selection
  52. 21.Statistical Methods , Hypothesis Testing ,Recursive Feature Elimination
  53. 22.Boruta https://github.com/scikit-learn-contrib/boruta_py https://analyticsindiamag.com/hands-on-guide-to-automated-feature-selection-using-boruta/
  54. 23.Sequence Feature Selection, SelectFromModel
  55. Missing Value Ratio Analysis,Low Variance Filter,High Correlation Filter,Backward Feature Elimination,Forward Feature Elimination ,SequentialFeatureSelector
  56. PyImpetus https://github.com/atif-hassan/PyImpetus
  57. https://www.analyticsvidhya.com/blog/2016/12/introduction-to-feature-selection-methods-with-an-example-or-how-to-select-the-right-variables/
  58. Automate your Feature Selection Workflow in one line of Python code https://github.com/AutoViML/featurewiz https://towardsdatascience.com/automate-your-feature-selection-workflow-in-one-line-of-python-code-3d4f23b7e2c4
  59. https://machinelearningmastery.com/feature-selection-with-real-and-categorical-data/ https://machinelearningmastery.com/statistical-hypothesis-tests-in-python-cheat-sheet/
  60. https://www.analyticsvidhya.com/blog/2020/10/a-comprehensive-guide-to-feature-selection-using-wrapper-methods-in-python/
  61. https://towardsdatascience.com/5-feature-selection-method-from-scikit-learn-you-should-know-ed4d116e4172
  62. Feature Engineering Tools https://neptune.ai/blog/feature-engineering-tools?utm_source=twitter&utm_medium=tweet&utm_campaign=blog-feature-engineering-tools
  63. https://towardsdatascience.com/practical-code-implementations-of-feature-engineering-for-machine-learning-with-python-f13b953d4bcd
  64. PyRasgo https://github.com/rasgointelligence/PyRasgo https://docs.rasgoml.com/rasgo-docs/?_ga=2.209281745.2123722956.1645542654-525286113.1645542654
  65. Automated Feature Engineering Using Deep Feature Synthesis (DFS) https://heartbeat.comet.ml/introduction-to-automated-feature-engineering-using-deep-feature-synthesis-dfs-3feb69a7c00b
  66. Automatic Feature Selection in python https://verstack.readthedocs.io/en/latest/#featureselector
  67. rulefit https://github.com/christophM/rulefit
  68. Featurewiz: Fast way to select the best features in a data
  69. select best features featurewiz https://github.com/AutoViML/featurewiz
  70. Featuretools: https://github.com/alteryx/featuretools https://analyticsindiamag.com/introduction-to-featuretools-a-python-framework-for-automated-feature-engineering/
  71. AutoFeat: https://github.com/cod3licious/autofeat
  72. TSFresh: https://github.com/blue-yonder/tsfresh
  73. FeatureSelector: https://github.com/WillKoehrsen/feature-selector
  74. unsupervised feature selection technique https://github.com/atif-hassan/FRUFS
  75. rulefit https://github.com/christophM/rulefit
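Two of the filter methods listed in this section, dropping constant features and dropping highly correlated features, can be sketched without any library. The thresholds, column names, and the helper `filter_features` are assumptions for illustration; libraries such as feature_engine offer DropConstantFeatures and DropCorrelatedFeatures for real use.

```python
import math
import statistics

def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def filter_features(columns, var_threshold=0.0, corr_threshold=0.9):
    """Keep features with variance above var_threshold that are not
    highly correlated with a feature that was already kept."""
    kept = []
    for name, values in columns.items():
        if statistics.pvariance(values) <= var_threshold:
            continue   # constant or quasi-constant feature
        if any(abs(pearson(values, columns[k])) > corr_threshold for k in kept):
            continue   # redundant with an earlier feature
        kept.append(name)
    return kept

columns = {
    "const": [1, 1, 1, 1, 1],
    "x": [1, 2, 3, 4, 5],
    "x_twice": [2, 4, 6, 8, 10],   # perfectly correlated with "x"
    "noise": [3, 1, 4, 1, 5],
}
kept = filter_features(columns)
print(kept)
```

Filter methods like these look only at the features themselves (or simple statistics against the target), so they are cheap but ignore how a given model would actually use each feature.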

5.Data splitting

  1. The splitting ratio depends on the size of the available dataset
  2. Training data,Validation data,Testing data
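A three-way split into training, validation, and test data can be sketched as follows. The 60/20/20 ratio, the seed, and the function name are assumptions; as noted above, the right ratio depends on how much data is available.

```python
import random

def train_val_test_split(items, val_frac=0.2, test_frac=0.2, seed=42):
    """Shuffle once, then slice into train / validation / test parts."""
    items = list(items)                    # copy so the caller's data is untouched
    random.Random(seed).shuffle(items)     # seeded shuffle for reproducibility
    n_test = round(len(items) * test_frac)
    n_val = round(len(items) * val_frac)
    test = items[:n_test]
    val = items[n_test:n_test + n_val]
    train = items[n_test + n_val:]
    return train, val, test

train, val, test = train_val_test_split(range(50))
print(len(train), len(val), len(test))
```

The validation part is used for model selection and tuning, while the test part is held back until the end so that the final performance estimate is not biased by those choices.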

6.Model selection

Machine learning https://scikit-learn.org/stable/index.html

Choose the Right Machine Learning Algorithm for Your Application https://towardsdatascience.com/how-to-choose-the-right-machine-learning-algorithm-for-your-application-1e36c32400b9

Time Complexity Of Machine Learning Models -https://www.thekerneltrip.com/machine/learning/computational-complexity-learning-algorithms/

interactive tools https://github.com/Machine-Learning-Tokyo/Interactive_Tools

mindsdb In-Database Machine Learning https://github.com/mindsdb/mindsdb

HTML tables into Google Sheets -https://towardsdatascience.com/import-html-tables-into-google-sheets-effortlessly-f471eae58ac9

Machine Learning Playground https://ml-playground.com/

visual introduction to machine learning http://www.r2d3.us/visual-intro-to-machine-learning-part-1/

draw a dataset from inside jupyter https://pypi.org/project/drawdata/ https://www.youtube.com/watch?v=b0rsDPQ3bjg

Visual programming language for machine learning - Kobra https://kobra.dev/

compose generate labels for supervised learning https://github.com/alteryx/compose https://analyticsindiamag.com/guide-to-prediction-engineering-with-compose/

human-learn https://towardsdatascience.com/human-learn-create-rules-by-drawing-on-the-dataset-bcbca229f00

Neural Network https://playground.tensorflow.org/#activation=tanh&batchSize=10&dataset=circle&regDataset=reg-plane&learningRate=0.03&regularizationRate=0&noise=0&networkShape=4,2&seed=0.46672&showTestData=false&discretize=false&percTrainData=50&x=true&y=true&xTimesY=false&xSquared=false&ySquared=false&cosX=false&sinX=false&cosY=false&sinY=false&collectStats=false&problem=classification&initZero=false&hideText=false

Microscope https://microscope.openai.com/models https://www.youtube.com/watch?v=y0-ISRhL4Ks

Ptpython Autocompletion, Autosuggestion, Docstring https://github.com/prompt-toolkit/ptpython https://towardsdatascience.com/ptpython-a-better-python-repl-6e21df1eb648

3 Tools to Track and Visualize the Execution of your Python Code https://towardsdatascience.com/3-tools-to-track-and-visualize-the-execution-of-your-python-code-666a153e435e

ML Code memory Consuming https://towardsdatascience.com/how-much-memory-is-your-ml-code-consuming-98df64074c8f

PyGrid Privacy-preserving, Decentralized Data Science https://github.com/OpenMined/PyGrid/

Best and Worst Cases of Machine-Learning Models https://medium.com/towards-artificial-intelligence/best-and-worst-cases-of-machine-learning-models-part-1-36cdb9296611

https://www.youtube.com/watch?v=mlumJPFvooQ&list=PLZoTAELRMXVM0zN0cgJrfT6TK2ypCpQdY

skater Machine Learning Model Interpretation https://towardsdatascience.com/machine-learning-model-interpretation-47b4bc29d17f

Speedml Speeding up Machine Learning https://towardsdatascience.com/speedml-speeding-up-machine-learning-5dccbf21effd

2-2000x faster ML algos https://github.com/danielhanchen/hyperlearn

snapml 30 Times Faster Than Scikit-Learn https://www.zurich.ibm.com/snapml/

scikit-learn-intelex https://github.com/intel/scikit-learn-intelex

composer speed-up algorithms for model training https://github.com/mosaicml/composer

pdpipe https://github.com/pdpipe/pdpipe pipeline https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html

PHOTONAI A high level Python API for designing and optimizing machine learning pipelines https://www.photon-ai.com/

Machine Learning in Tableau with PyCaret https://towardsdatascience.com/machine-learning-in-tableau-with-pycaret-166ffac9b22e

TabNet balances explainability and model performance on tabular data https://towardsdatascience.com/tabnet-e1b979907694

FreaAI That Automatically Finds Weaknesses In ML models https://analyticsindiamag.com/ibm-launches-freaai-that-automatically-finds-weaknesses-in-ml-models/

A.Supervised learning (have label data)

  1. Transformers for Tabular Data: TabTransformer https://github.com/lucidrains/tab-transformer-pytorch
  2. 1.Regression (output feature is continuous)
  3. Linear regression, multiple linear regression, polynomial regression, exponential regression, Bayesian regression, robust regression, Huber regressor, support vector regression, decision tree regression, random forest regression, TensorFlow Decision Forests, RANSAC regression
  4. Least squares method, ordinary least squares, linear-tree, Regularized Greedy Forests, XGBoost, ridge (L2 regularization), lasso (L1 regularization, gives sparser models), elastic net, LARS, CatBoost, gradient boosting, AdaBoost, LightGBM, Explainable Boosting Machine, histogram-based gradient boosting, stacked gradient boosting machines, LightBoost, autoxgb, NGBoost, XBNet, Chefboost, GPBoost, Local Cascade Ensemble, principal component regression, Theil-Sen regression, linear spline, isotonic regression, bin regression, cubic spline, natural cubic spline, exponential moving average, quantile regression, quantile random forests, quantile GBM
  5. CART, stepwise regression, Multivariate Adaptive Regression Splines, Generalised Additive Model (learns non-linear feature effects), TabNet
  6. statsassume Automating Assumption Checks for Regression Models https://github.com/kennethleungty/statsassume
  7. Locally Weighted Linear Regression https://towardsdatascience.com/locally-weighted-linear-regression-in-python-3d324108efbf
  8. TuringBot https://www.youtube.com/watch?v=LyKzKvjyIPo
  9. chefboost is an alternative library for training tree-based models https://github.com/serengil/chefboost
  10. growtrees About Cost-Aware Robust Tree Ensembles for Security Applications https://github.com/surrealyz/growtrees
  11. 2.Classification (output feature in categorical data form)
  12. Binary, multi-class, multi-label
  13. Logistic regression, K-Nearest Neighbors, Support Vector Machine, kernel SVM, Naive Bayes, decision tree classification, linear-tree, TensorFlow Decision Forests
  14. Random forest classification, Regularized Greedy Forests, XGBoost, DART booster, autoxgb, LightGBM, AdaBoost, gradient boosting, XBNet, CatBoost, Gaussian NB, LGBMClassifier, LinearDiscriminantAnalysis, Extreme Gradient Boosting Machine, Explainable Boosting Machine, fairgbm, Chefboost, GPBoost, NGBoost, Local Cascade Ensemble, passive aggressive classifier, CART, C4.5, C5.0, TabNet, ExtraTreesClassifier, TabPFN

  1. https://mlwhiz.com/blog/2019/11/12/dtsplits/?utm_campaign=the-simple-math-behind-3-decision-tree-splitting-criterions&utm_medium=social_link&utm_source=missinglettr-linkedin
  2. 4 Useful techniques avoid overfitting in decision trees https://towardsdatascience.com/4-useful-techniques-that-can-mitigate-overfitting-in-decision-trees-87380098bd3c
  3. Machine Learning its all about assumptions https://www.kdnuggets.com/2021/02/machine-learning-assumptions.html
  4. GPBoost: A Library To Combine Tree-Boosting With Gaussian Process And Mixed-Effects Models https://analyticsindiamag.com/guide-to-gpboost-a-machine-learning-library-to-combine-tree-boosting/
  5. Data and Concept Drift https://evidentlyai.com/blog/machine-learning-monitoring-data-and-concept-drift
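To make the least squares and ridge (L2) entries from the regression list above concrete, here is a one-feature sketch in plain Python. With a single feature, the slope has the closed form Sxy / (Sxx + l2), where l2 = 0 gives ordinary least squares. The function name `fit_line` and the data are hypothetical, and the intercept is left unpenalized by assumption.

```python
import statistics

def fit_line(x, y, l2=0.0):
    """Fit y ≈ intercept + slope * x; l2 > 0 shrinks the slope (ridge penalty)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / (sxx + l2)          # l2 = 0 reduces to ordinary least squares
    intercept = my - slope * mx       # the intercept is not penalized
    return slope, intercept

x = [0, 1, 2, 3, 4]
y = [1, 3, 5, 7, 9]                   # generated from y = 2x + 1
ols = fit_line(x, y)
ridge = fit_line(x, y, l2=10.0)
print(ols)     # recovers slope 2 and intercept 1
print(ridge)   # slope shrunk toward 0, intercept adjusts accordingly
```

The penalty trades a little bias for lower variance, which mainly pays off when features are many or correlated rather than in a clean toy case like this one.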

B.Unsupervised learning(no label(target) data)

  1. 1.Dimensionality reduction - PCA,ppa,SVD,LDA,som,tsne,openTSNE,plsr,pcr,autoencoders,kernelpca,Latent Semantic Analysis,Factor Analysis,Locality Preserving Projections,Isometric Mapping,Multiple correspondence analysis (MCA),Multiple factor analysis (MFA),Factor analysis of mixed data (FAMD),vae,CompressionVAE,Gaussian Mixture Model,Bayesian Gaussian Mixture Model
  2. non-linear data using Kernel PCA, Non-Negative Matrix Factorization(NMF), IsoMap, t-SNE, and UMAP,TDA(Topological Data Analysis)
  3. t-SNE Effectively https://distill.pub/2016/misread-tsne/
  4. 2.Clustering : Centroid-based Model ,Density-based Model ,Distribution-based Model,Connectivity-based model
  5. 17 clustering https://towardsdatascience.com/17-clustering-algorithms-used-in-data-science-mining-49dbfa5bf69a
  6. https://neptune.ai/blog/clustering-algorithms?utm_source=twitter&utm_medium=tweet&utm_campaign=blog-clustering-algorithms
  7. classix Fast and explainable clustering based on sorting https://github.com/nla-group/classix
  8. https://www.mygreatlearning.com/blog/unsupervised-machine-learning/?highlight=unsupervised%20machine%20learning&utm_source=GLA&utm_medium=Blog&utm_campaign=1-16th%20May
  9. https://scikit-learn.org/stable/modules/clustering.html https://machinelearningmastery.com/clustering-algorithms-with-python/
  10. https://towardsdatascience.com/17-clustering-algorithms-used-in-data-science-mining-49dbfa5bf69a
  11. RFM Segmentation in E-Commerce https://towardsdatascience.com/rfm-segmentation-in-e-commerce-e0209ce8fcf6
  12. kmodes https://www.youtube.com/watch?v=8eATPLDJ0NQ
  13. Agglomerative Hierarchical Clustering Using AGNES Algorithm https://analyticsindiamag.com/perform-agglomerative-hierarchical-clustering-using-agnes-algorithm/
  14. CLARANS Clustering Algorithm https://analyticsindiamag.com/comprehensive-guide-to-clarans-clustering-algorithm/
  15. https://pub.towardsai.net/fully-explained-birch-clustering-for-outliers-with-python-2ad6243f126b
  16. https://www.kdnuggets.com/2020/12/algorithms-explained-k-means-k-medoids-clustering.html
  17. https://www.kdnuggets.com/2017/03/naive-sharding-centroid-initialization-method.html
  18. CLASSIX clustering https://github.com/nla-group/classix
  19. K-Means 8x faster, 27x lower error than Scikit-learn in 25 lines https://www.kdnuggets.com/2021/01/k-means-faster-lower-error-scikit-learn.html#.YAHAAIpnx4A.linkedin
  20. k-Means Clustering by up to 10x Over Scikit-Learn https://towardsdatascience.com/how-to-speed-up-your-k-means-clustering-by-up-to-10x-over-scikit-learn-5aec980ebb72
  21. 3.Association Rule Learning - support, lift, confidence, leverage, conviction, Apriori, Eclat, FP-growth, FP-tree construction, FP-Max algorithm, association_rules, frequent itemset mining, multi-relation association rules, high-order pattern discovery, K-optimal pattern discovery, approximate frequent itemset, generalized association rules, quantitative association rules, interval data association rules, sequential pattern mining, hypergeometric networks, constraint-based mining, multi-level association rules, fuzzy association rules
  22. Sequential Patterns
  23. Generalized Sequential Patterns (GSP)
  24. Prefix-Projected Sequential Pattern Mining (PrefixSpan)
  25. Sequential Pattern Discovery using Equivalent Class (SPADE)
  26. Frequent Pattern-Projected Sequential Pattern Mining (FreeSpan)
  27. interpretable association rule https://analyticsindiamag.com/a-guide-to-interpretable-association-rule-mining-using-pycaret/
  28. 4.Market Segmentation
  29. Demographic segmentation, geographic segmentation, firmographic segmentation, behavioural segmentation
  30. 5.Recommendation system - Surprise, TensorFlow Recommenders, Recmetrics
  31. competitive-recsys https://github.com/chihming/competitive-recsys
  32. a.collaborative Recommendation system (model based, memory based(item based,user based),hybrid) user-item interaction matrix
  33. Classification-based collaborative filtering
  34. Model-based collaborative filtering systems(Cluster model,linear regression,Bayesian networks ,latent factor(probabilistic latent,matrix factorization(als,SGD,SVD),neural network,lda))
  35. b.content based Recommendation system
  36. similarity based(user-user similarity,item-item similarity)
  37. matrix factorization(SVD and SVD++),Popularity-based recommenders
  38. c.utility based Recommendation system
  39. d.knowledge based Recommendation system
  40. e.demographic based Recommendation system
  41. f.hybrid based Recommendation system
  42. Popularity based Recommendation system (NON-PERSONALIZED )
  43. g.Average Weighted Recommendation
  44. h.using K Nearest Neighbor
  45. i.cosine distance recommender system
  46. item2vec
  47. j.TensorFlow Recommenders https://www.tensorflow.org/recommenders
  48. recommenders https://github.com/microsoft/recommenders
  49. Neural Collaborative Filtering for Personalized Ranking
  50. AutoRec: Rating Prediction with Autoencoders Matrix Factorization
  51. k.Surprise baseline model
  52. Context-aware Recommender Systems,Mobile Recommender Systems,Group Recommender Systems,Multi-stakeholder Recommender Systems
  53. l.Neural Collaborative Filtering (NCF)
  54. l.Tf-Rec TensorFlow Recommendation https://github.com/Praful932/Tf-Rec
  55. Nvidia Merlin
  56. m.Deep Learning Recommendation Models https://www.kdnuggets.com/2021/04/deep-learning-recommendation-models-dlrm-deep-dive.html
  57. Restricted Boltzmann Machines,Auto-Encoders
  58. TOROS Buffalo https://github.com/kakao/buffalo
  59. recommenders-https://github.com/microsoft/recommenders
  60. LightFM https://making.lyst.com/lightfm/docs/home.html
  61. lkpy Python recommendation toolkit https://github.com/lenskit/lkpy https://analyticsindiamag.com/how-to-build-recommender-systems-using-lenskit/
  62. torchrec https://github.com/pytorch/torchrec
  63. PyTorch implementations of deep reinforcement learning algorithms and environments https://github.com/p-christ/Deep-Reinforcement-Learning-Algorithms-with-PyTorch
  64. recmetrics library of metrics for evaluating recommender systems https://github.com/statisticianinstilettos/recmetrics
  65. Downsize Recommendation Models By 112 Times https://analyticsindiamag.com/explained-facebooks-novel-method-to-downsize-recommendation-models-by-112-times/
  66. torchrec, LensKit, RGRecSys, Surprise, TensorFlow Recommenders, NVIDIA Merlin, Recmetrics, DeepCTR, OpenRec, fastFM, LightFM
  67. Session-based RecSys could be done with:Recency-based Weighting (exp.decay),Probabilistic Graphical Models (FPMC, FOSSIL),Convolutional NN (Caser, NextItNet),Recurrent NN (GRU4Rec),Graph NN (SRGNN, GCSAN),Attention(STAMP, NARM, FDSA, SHAN),Transformer(BERT4Rec, Transformer4Rec),Knowledge Graph(KSR, GRU4RecKG, KGCN, KGAT, RippleNet),Landscape, Rexy, Tensor Recommendation Engine, Light FM, Spotlight, Case Recommender
  68. https://analyticsindiamag.com/top-open-source-recommender-systems-in-python-for-your-ml-project/
  69. https://towardsdatascience.com/modern-recommender-systems-a0c727609aa8
  70. https://machinelearningmastery.com/recommender-systems-resources/
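
To make the "cosine distance recommender system" item above concrete, here is a minimal sketch of an item-based recommender using only NumPy. The rating matrix, the function names and the rule "0 means not rated" are assumptions made for illustration; the libraries listed above (Surprise, LightFM, etc.) have their own APIs.

```python
import numpy as np

def item_cosine_similarity(ratings: np.ndarray) -> np.ndarray:
    """Item-item cosine similarity for a (users x items) rating matrix."""
    norms = np.linalg.norm(ratings, axis=0)
    norms[norms == 0] = 1.0              # avoid division by zero for unrated items
    normalized = ratings / norms
    return normalized.T @ normalized

def recommend(ratings: np.ndarray, user: int, k: int = 2) -> list:
    """Return up to k unrated items, ranked by similarity to the user's rated items."""
    sim = item_cosine_similarity(ratings)
    scores = ratings[user] @ sim         # similarity weighted by the user's ratings
    scores[ratings[user] > 0] = -np.inf  # never recommend an already rated item
    order = np.argsort(-scores)
    return [int(i) for i in order[:k] if np.isfinite(scores[i])]

# Toy data (assumption): rows are users, columns are items, 0 = not rated
ratings = np.array([
    [5, 4, 0, 0],
    [4, 5, 0, 1],
    [0, 0, 5, 4],
    [5, 0, 0, 0],
], dtype=float)

print(recommend(ratings, user=3, k=2))
```

User 3 only rated item 0, so the items most often co-rated with item 0 come first.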

C.Ensemble methods

  1. 1.Stacking models https://www.analyticsvidhya.com/blog/2021/03/advanced-ensemble-learning-technique-stacking-and-its-variants/?
  2. vecstack https://github.com/vecxoz/vecstack
  3. Cascading Ensembles,Cohorted Ensembles
  4. 2.Bagging models (Bagging (sampling with replacement), Pasting (sampling without replacement))
  5. 3.Boosting models
  6. 4.Blending
  7. 5.Voting (Hard Voting,Soft Voting)
  8. VOTING ENSEMBLE
  9. Simple : Max Voting, Averaging, Weighted Averaging,Simple Average,Rank Averaging,Bayesian Model,Majority Voting
  10. mlens ML-Ensemble high performance ensemble learning https://github.com/flennerhag/mlens
  11. https://analyticsindiamag.com/do-ensemble-methods-always-work/
  12. Shapley value of players (models) in weighted voting games https://github.com/benedekrozemberczki/shapley
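
The difference between hard voting and soft voting listed above can be sketched in a few lines of NumPy. The probabilities below are made-up numbers for three hypothetical models, chosen so that the two schemes disagree; the function names are illustrative and not taken from any library.

```python
import numpy as np

def hard_vote(labels: np.ndarray, n_classes: int) -> np.ndarray:
    """Majority vote. labels has shape (n_models, n_samples)."""
    n_models, n_samples = labels.shape
    votes = np.zeros((n_classes, n_samples), dtype=int)
    for model_labels in labels:
        votes[model_labels, np.arange(n_samples)] += 1
    return votes.argmax(axis=0)

def soft_vote(probs: np.ndarray, weights=None) -> np.ndarray:
    """Average class probabilities. probs has shape (n_models, n_samples, n_classes)."""
    return np.average(probs, axis=0, weights=weights).argmax(axis=1)

# Predicted probabilities of 3 models for 2 samples and 2 classes (assumed values)
probs = np.array([
    [[0.90, 0.10], [0.40, 0.60]],
    [[0.40, 0.60], [0.45, 0.55]],
    [[0.45, 0.55], [0.90, 0.10]],
])
labels = probs.argmax(axis=2)

hard = hard_vote(labels, n_classes=2)   # -> [1, 1]
soft = soft_vote(probs)                 # -> [0, 0]
print(hard, soft)
```

Hard voting counts only the predicted labels, while soft voting lets one very confident model outweigh two weakly confident ones. Passing `weights` gives weighted averaging.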

D.Reinforcement learning https://neptune.ai/blog/best-reinforcement-learning-tutorials-examples-projects-and-courses

  1. 2 types: a) model-free b) model-based
  2. gym-https://github.com/openai/gym reinforcement learning by using PyTorch-https://github.com/SforAiDl/genrl
  3. agent,environment,policy(On-Policy vs Off-Policy),reward function,value function,state,action,episode,actor-critic
  4. The agent applies an action to the environment and receives the corresponding reward, and in this way it learns the environment
  5. How to get started with Reinforcement Learning https://gordicaleksa.medium.com/how-to-get-started-with-reinforcement-learning-rl-4922fafeaf8c
  6. 1.Q-Learning
  7. 2.Deep Q-Learning
  8. 3.Deep Convolutional Q-Learning
  9. Deep Deterministic Policy Gradient
  10. 4.Twin Delayed DDPG,DQN,Temporal difference
  11. 5.A3C (Asynchronous Advantage Actor-Critic),A2C,Soft Actor-Critic (SAC),Adversarial Motion Priors (AMP),Cross-Entropy Method (CEM),Deep Deterministic Policy Gradient (DDPG),Double Deep Q-Network (DDQN),Deep Q-Network (DQN),Proximal Policy Optimization (PPO),Q-learning,State Action Reward State Action (SARSA),Twin-Delayed DDPG (TD3),Trust Region Policy Optimization (TRPO)
  12. 6.Advantage weighted actor critic (AWAC).
  13. 7.XCS
  14. 8.genetic algorithm,sarsa,natural policy gradient,Policy Gradient Learning
  15. https://simoninithomas.github.io/deep-rl-course/
  16. SARSA,REINFORCE,PPO,DDPG,TD3
  17. AUTORL: AUTOML FOR RL https://www.automl.org/blog-autorl/
  18. Environments-OpenAI Gym, DeepMind Lab, Unity ML-Agents
  19. https://data-flair.training/news/python-libraries-for-reinforcement-learning/
  20. https://analyticsindiamag.com/8-best-free-resources-to-learn-deep-reinforcement-learning-using-tensorflow/
  21. https://analyticsindiamag.com/top-8-autonomous-driving-open-source-projects-one-must-try-hands-on/
  22. https://analyticsindiamag.com/8-toolkits-for-reinforcement-learning-models-that-make-reasoning-explainability-core-to-ai/
  23. https://neptune.ai/blog/best-reinforcement-learning-tutorials-examples-projects-and-courses
  24. https://towardsdatascience.com/value-based-methods-in-deep-reinforcement-learning-d40ca1086e1
  25. https://neptune.ai/blog/best-reinforcement-learning-tutorials-examples-projects-and-courses?utm_source=twitter&utm_medium=tweet&utm_campaign=blog-best-reinforcement-learning-tutorials-examples-projects-and-courses
  26. TensorForce: A TensorFlow-based Reinforcement Learning Framework https://analyticsindiamag.com/guide-to-tensorforce-a-tensorflow-based-reinforcement-learning-framework/
  27. Decision Transformer: Reinforcement Learning via Sequence Modeling https://github.com/kzl/decision-transformer
  28. Open AI Gym - https://gym.openai.com/
  29. DeepMinds MuZero https://deepmind.com/blog/article/muzero-mastering-go-chess-shogi-and-atari-without-rules?utm_campaign=Learning%20Posts&utm_content=150411901&utm_medium=social&utm_source=twitter&hss_channel=tw-3018841323
  30. KerasRL https://github.com/keras-rl/keras-rl
  31. pyqlearning
  32. tensorforce https://tensorforce.readthedocs.io/en/latest/index.html
  33. Practical_RL https://github.com/yandexdataschool/Practical_RL
  34. rl_coach https://github.com/IntelLabs/coach#installation MushroomRL https://mushroomrl.readthedocs.io/en/latest/
  35. TFAgents https://github.com/tensorflow/agents (https://www.tensorflow.org/agents) https://deepmind.com/blog/article/trfl
  36. TorchRec https://pytorch.org/blog/introducing-torchrec/ TensorFlow Recommenders https://www.tensorflow.org/recommenders
  37. behaviour trees used in reinforcement learning https://analyticsindiamag.com/how-are-behaviour-trees-used-in-reinforcement-learning/
  38. Automate The Stock Market Using FinRL (Deep Reinforcement Learning Library) https://analyticsindiamag.com/stock-market-prediction-using-finrl/
  39. OpenAI Baselines https://github.com/openai/baselines
  40. https://www.youtube.com/playlist?list=PL_iWQOsE6TfURIIhCrlt-wj9ByIVpbfGc
  41. https://neptune.ai/blog/the-best-tools-for-reinforcement-learning-in-python?utm_source=twitter&utm_medium=tweet&utm_campaign=blog-the-best-tools-for-reinforcement-learning-in-python
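
The agent / environment / reward loop and the Q-Learning item above can be illustrated with a small tabular sketch. The 5-state "corridor" environment, the hyperparameters and the function names are all assumptions made for this example; real projects would normally use an environment from Gym or a similar library.

```python
import numpy as np

N_STATES, GOAL = 5, 4      # states 0..4 on a line; state 4 is terminal
LEFT, RIGHT = 0, 1

def step(state, action):
    """Toy environment (hypothetical): move left or right, reward 1 at the goal."""
    next_state = max(state - 1, 0) if action == LEFT else state + 1
    reward = 1.0 if next_state == GOAL else 0.0
    return next_state, reward, next_state == GOAL

def train_q_learning(episodes=500, alpha=0.5, gamma=0.9, epsilon=0.3, seed=0):
    rng = np.random.default_rng(seed)
    q = np.zeros((N_STATES, 2))                  # Q-table: one row per state
    for _ in range(episodes):
        state = 0
        for _ in range(100):                     # cap the episode length
            if rng.random() < epsilon:           # explore
                action = int(rng.integers(2))
            else:                                # exploit, breaking ties at random
                best = np.flatnonzero(q[state] == q[state].max())
                action = int(rng.choice(best))
            next_state, reward, done = step(state, action)
            # Q-learning target: reward plus discounted best value of the next state
            target = reward + (0.0 if done else gamma * q[next_state].max())
            q[state, action] += alpha * (target - q[state, action])
            state = next_state
            if done:
                break
    return q

q_table = train_q_learning()
greedy_policy = q_table[:GOAL].argmax(axis=1)    # best action in each non-terminal state
print(greedy_policy)
```

Q-learning is off-policy: the update uses the max over next actions, even though the agent itself behaves epsilon-greedily.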

Semi-Supervised Learning: uses a small amount of labeled data together with a large amount of unlabeled data during training

Machine Learning with Graphs http://web.stanford.edu/class/cs224w/

E.Deep learning (use when you have a huge amount of highly complex data; it is the state of the art for unstructured data) https://www.kdnuggets.com/2019/11/designing-neural-networks.html

Model Zoo Discover open source deep learning code and pretrained models https://modelzoo.co/

Visualizing your Neural Network with Netron,Net2Vis,visualkeras,draw_convnet,NNSVG,PlotNeuralNet,Tensorboard,Caffe,Matlab,Keras.js,keras-sequential-ascii,DotNet,Graphviz,Keras Visualization,Conx,ENNUI,NNet,GraphCore,Monial,Quiver

Sharing the best resources on various machine learning topics https://www.backprop.org/

deeplearning-models-https://github.com/rasbt/deeplearning-models

Deep-Learning-with-PyTorch- https://pytorch.org/assets/deep-learning/Deep-Learning-with-PyTorch.pdf

Frameworks:PyTorch,TensorFlow,Keras,Caffe,Theano,MXNet,Matlab,Microsoft Cognitive Toolkit,opacus (Train PyTorch models with Differential Privacy)

https://towardsdatascience.com/the-mostly-complete-chart-of-neural-networks-explained-3fb6f2367464 https://docs.deepstack.cc/getting-started/index.html

fastest way to build, debug, and interpret neural networks https://www.perceptilabs.com/

Nengo: A New Neural Network Building and Deployment Tool https://pub.towardsai.net/nengo-a-new-neural-network-building-and-deployment-tool-66677c65fa19

Binarized Neural Network: memory size is reduced, and bitwise operations improve power efficiency https://neptune.ai/blog/binarized-neural-network-bnn-and-its-implementation-in-ml

paddlehub https://github.com/PaddlePaddle/PaddleHub Performing Computer Vision & NLP Tasks in a Single Line of Code https://towardsdatascience.com/performing-computer-vision-nlp-tasks-in-a-single-of-code-f7205f212d34

scikit-neuralnetwork https://towardsdatascience.com/the-simplest-way-to-train-a-neural-network-in-python-17613fa97958 https://github.com/aigamedev/scikit-neuralnetwork

NVIDIA’s Kaolin: A 3D Deep Learning Library https://analyticsindiamag.com/nvidias-kaolin-3d-deep-learning-library/ https://github.com/NVIDIAGameWorks/kaolin

PySyft is a Python library for secure and private Deep Learning https://github.com/OpenMined/PySyft

keras-vis Visualizing Learning of a Deep Neural Network https://towardsdatascience.com/deep-learning-model-visualization-6cf6290dc981

Deep Replay Visualizing Learning of a Deep Neural Network https://towardsdatascience.com/visualizing-learning-of-a-deep-neural-network-b05f1711651c

keras-visualizer Visualizing Keras Models https://towardsdatascience.com/visualizing-keras-models-4d0063c8805e

Lucid Library is an open source framework to improve the interpretation of deep neural networks

Gradient-Centralization-TensorFlow improve your training performance of TensorFlow models with just 2 lines of code! https://github.com/Rishit-dagli/Gradient-Centralization-TensorFlow

XBNet: An Extremely Boosted Neural Network

MIL-WebDNN Fastest DNN Execution Framework on Web Browser https://mil-tokyo.github.io/webdnn/

Vector Hub models to turn data into vectors text2vec, image2vec, video2vec, graph2vec, bert, inception, etc https://github.com/RelevanceAI/vectorhub

torchbearer: A model fitting library for PyTorch https://github.com/pytorchbearer/torchbearer

1.Multilayer perceptron(MLP)

  1. 1.Regression task
  2. 2.Classification task
  3. Tabnet and deep tables for tabular dataset using deep learning

2.Convolutional neural network (use for image data)

  1. Best MLOps Tools for Your Computer Vision Project Pipeline https://neptune.ai/blog/best-mlops-tools-for-computer-vision-project?utm_source=twitter&utm_medium=tweet&utm_campaign=blog-best-mlops-tools-for-computer-vision-project
  2. mediapipe https://google.github.io/mediapipe/ cv modelhub https://modelplace.ai/
  3. all openmmlab https://github.com/open-mmlab mmdetection,mmsegmentation,mmediting,mmdetection3d,mmaction2,mmocr,mmpose,etc...
  4. glasses High-quality Neural Networks for Computer Vision https://github.com/FrancescoSaverioZuppichini/glasses
  5. IceVision https://airctic.com/0.8.0/
  6. Top Computer Vision Google Colab Notebooks- https://www.qblocks.cloud/creators/computer-vision-google-colab-notebooks
  7. for low code object detection (detecto)- https://github.com/alankbi/detecto
  8. CV-pretrained-model- https://github.com/balavenkatesh3322/CV-pretrained-model
  9. Fast Computer Vision Model Building PyTorch Lightning Flash and FiftyOne https://towardsdatascience.com/open-source-tools-for-fast-computer-vision-model-building-b39755aab490
  10. 5 Open-Source Facial Recognition https://medium.com/analytics-vidhya/ways-to-boost-your-computer-vision-projects-by-using-5-open-source-facial-recognition-projects-56668f170cb9
  11. cnn alternative CapsNet https://github.com/XifengGuo/CapsNet-Keras
  12. EDA for image data data-gradients
  13. 1.Classification of image
  14. albumentations https://github.com/albumentations-team/albumentations AugLy https://github.com/facebookresearch/AugLy
  15. create own model,LeNet,AlexNet,DenseNet,MobileNet,ShuffleNet,SqueezeNet,ResNet,GoogLeNet,Inception,VGG16,VGG19,EfficientNet,EfficientNetV2,EfficientDet,NASNet,STN,nasneta,senet,amoebanetc,DeiT (tiny,small,base),Meta Pseudo Labels,res-mlp-pytorch,MLP-Mixer,vit,DynamicViT,FNet,gMLP models,nfnet
  16. mmclassification https://github.com/open-mmlab/mmclassification
  17. https://theaisummer.com/cnn-architectures/ https://paperswithcode.com/sota/image-classification-on-imagenet
  18. timm https://pypi.org/project/timm/ https://github.com/rwightman/pytorch-image-models
  19. 2.Localization of object in image
  20. 3.Object detection and object segmentation
  21. rcnn,fast rcnn,faster rcnn,TensorFlow Object Detection,yolo v1,yolo v2,yolo v3,SlimYOLOv3,yolo v4,PP-YOLO,scaled yolov4,YOLOR,YoloV5,YOLOS,efficientdet,fast yolo,yolo tiny,yolo lite,yolo tiny++,yolact++,yolonas,yolov8
  22. maskrcnn,DeepLab-v3-plus,ssd,detectron,detectron2,D2Go,mobilenet,retinanet,R-fcn,Libra_R-CNN,detr facebook,mdetr,pspnet,segnet,U-net,UNet++,Efficient U-Nets,Dense-Gated U-Net,nnU-Net,v-net,TransUNet,H-DenseUNet,MultiResUNet,deeplab,globalconvolutionnetwork,fcn,EfficientDet,Vision Transformer,deit,VarifocalNet (VF-Net),DINO,BodyPix,vit,AugFPN,mlsd
  23. PixelLib Simplifying Object Segmentation with PixelLib Library https://github.com/ayoolaolafenwa/PixelLib
  24. mmdetection https://github.com/open-mmlab/mmdetection https://towardsdatascience.com/mmdetection-tutorial-an-end2end-state-of-the-art-object-detection-library-59064deeada3 https://github.com/open-mmlab/mmrotate
  25. mmdetection3d https://github.com/open-mmlab/mmdetection3d mmsegmentation https://github.com/open-mmlab/mmsegmentation
  26. fewshot https://github.com/open-mmlab/mmfewshot
  27. Zero-Shot Object Detection , annotate dataset https://github.com/microsoft/GLIP
  28. imageai.Detection ObjectDetection Segmentation models https://github.com/qubvel/segmentation_models
  29. Image-Segmentation-Using-Pixellib
  30. IceVision https://airctic.com/0.8.0/
  31. Image Generation Using TensorFlow Keras https://analyticsindiamag.com/getting-started-image-generation-tensorflow-keras/
  32. Video Understanding https://towardsdatascience.com/video-understanding-made-simple-with-pytorch-video-and-lightning-flash-c7d65583c37e
  33. Getting Started With Object Detection Using TensorFlow https://analyticsindiamag.com/object-detection-using-tensorflow/
  34. Instance Segmentation using Mask-RCNN with PixelLib and Python https://www.youtube.com/watch?v=i_-ud01wFhc
  35. MLP-Mixer: an all-MLP solution for Vision, from Google AI https://github.com/lucidrains/mlp-mixer-pytorch
  36. MMDetection https://analyticsindiamag.com/guide-to-mmdetection-an-object-detection-python-toolbox/ mediapipe https://github.com/google/mediapipe
  37. SSL Framework For Object Detection https://analyticsindiamag.com/googles-stac-ssl-framework-for-object-detection/
  38. GSDT https://analyticsindiamag.com/gsdt-gnns-for-simultaneous-detection-and-tracking/
  39. D2Go Brings Detectron2 To Mobile https://analyticsindiamag.com/facebooks-d2go-brings-detectron2-to-mobile/
  40. AdelaiDet open source toolbox for multiple instance-level detection and recognition tasks https://github.com/aim-uofa/AdelaiDet
  41. 3d object detection https://omdena.com/blog/3d-object-detection/?utm_source=linkedin&utm_medium=organic&utm_campaign=blog&utm_term=google-analytics
  42. PyMAF https://analyticsindiamag.com/guide-to-pymaf-pyramidal-mesh-alignment-feedback/
  43. 3 kinds of object segmentation are available: semantic segmentation, instance segmentation, panoptic segmentation
  44. segmentation_models https://github.com/qubvel/segmentation_models
  45. https://analyticsindiamag.com/guide-to-panoptic-segmentation-a-semantic-instance-segmentation-approach/ https://analyticsindiamag.com/semantic-vs-instance-vs-panoptic-which-image-segmentation-technique-to-choose/
  46. ResNeSt: A Better ResNet with the Same Costs https://analyticsindiamag.com/guide-to-resnest-a-better-resnet-with-the-same-costs/
  47. PAN: Pyramid Attention Network for Semantic Segmentation https://medium.com/mlearning-ai/review-pan-pyramid-attention-network-for-semantic-segmentation-semantic-segmentation-8d94101ba24a
  48. PyTorch based low code object detection-https://github.com/alankbi/detecto
  49. https://www.kdnuggets.com/2021/03/extraction-objects-images-videos-5-lines-code.html
  50. autogluon
  51. GluonCV https://medium.com/apache-mxnet/start-fitting-cv-models-like-scikit-learn-with-gluoncv-0-10-931ff910a38
  52. https://awesomeopensource.com/project/hoya012/deep_learning_object_detection
  53. 4.Object tracking (mean shift, optical flow and Kalman filter)
  54. Tracktor++,Trackrcnn,Jde,DeepSORT,FairMOT
  55. mmtracking https://github.com/open-mmlab/mmtracking https://github.com/open-mmlab/mmflow
  56. mmhuman3d https://github.com/open-mmlab/mmhuman3d
  57. Video Understanding https://github.com/open-mmlab/mmaction2
  58. 5.Deepdream,Neural style transfer, Pose estimation
  59. generative models https://github.com/open-mmlab/mmgeneration
  60. Machine Learning for Art https://ml4a.net/#
  61. Pose estimation by mediapipe library https://google.github.io/mediapipe/ https://www.youtube.com/watch?v=brwgBf6VB0I
  62. posemodule https://www.youtube.com/watch?v=5kaX3ta398w Pose Tracking https://www.youtube.com/watch?v=0JU3kpYytuQ&t=1650s
  63. 6.DEEP LEARNING METHODS FOR 2D: OpenPose,DeepPose,AlphaPose,tfpose,MultiPoseNet,MoveNet Lightning,VIBE,DeeperCut,Mask RCNN,DeepCut,Convolutional Pose Machines,PoseNet,MoveNet,Adobe's BodyNet,MoveNet and TensorFlow.js,High-Resolution Net,BlazePose
  64. openpose wrnchai densepose
  65. mmpose https://github.com/open-mmlab/mmpose
  66. Pose Estimation using OpenCV https://www.analyticsvidhya.com/blog/2021/05/pose-estimation-using-opencv/
  67. https://medium.com/beyondminds/an-overview-of-human-pose-estimation-with-deep-learning-d49eb656739b
  68. 3D POSE ESTIMATION
  69. 3D Image Classification https://keras.io/examples/vision/3D_image_classification/
  70. TensorFlow 2 Object Detection API tutorial https://tensorflow-object-detection-api-tutorial.readthedocs.io/en/latest/
  71. https://blog.paperspace.com/how-to-train-scaled-yolov4-object-detection/
  72. Image DA libraries Augmentor, Albumentations, ImgAug, AutoAugment, Transforms https://neptune.ai/blog/data-augmentation-in-python
  73. Simple transformations-Resize,Gray Scale,Normalize,Random Rotation,Center Crop,Random Crop,Gaussian Blur
  74. Position augmentation-Scaling,Cropping,Flipping,Padding,Rotation,Translation,Affine transformation,Kernel filters
  75. Color augmentation-Brightness,Contrast,Saturation,Hue
  76. Deep learning approach-Adversarial training,Neural style transfer,GAN data augmentation
  77. AS-One Run YOLOv7,v6,v5,R,X in under 20 lines of code https://github.com/augmentedstartups/AS-One
  78. Data augmentation feature space : noise,interpolation Data Space Character Level : Noise Induction,Rule-based Transformations Word Level : Noise Induction,Synonym Replacement,Embedding Replacement,Replacement by Language Models Phrase and Sentence Level : Interpolation,Structure-based Transformation Document Level:Round-trip Translation,Generative Methods
  79. flipping, rotation, scaling ratio, noise injection, changing contrast, translation, cropping, color jittering,AutoAugment,Fast AutoAugment,Population Based Augmentation,RandAugment
  80. More advanced techniques-Gaussian Noise,Random Blocks,Central Region
  81. albumentations https://github.com/albumentations-team/albumentations https://towardsdatascience.com/getting-started-with-albumentation-winning-deep-learning-image-augmentation-technique-in-pytorch-47aaba0ee3f8
  82. AugLy A Modern Data Augmentation Library https://analyticsindiamag.com/complete-guide-to-augly-a-modern-data-augmentation-library/ https://github.com/facebookresearch/AugLy
  83. Data augmentation with tf.data
  84. Image augmentation libraries: ImageDataGenerator,Albumentations,SOLT,Imgaug,Augmentor,AutoAugment (DeepAugment)
  85. Augmentor Image augmentation library in Python for machine learning https://github.com/mdbloice/Augmentor
  86. albumentations https://github.com/albumentations-team/albumentations
  87. HiSD: Image-to-Image translation via Hierarchical Style Disentanglement https://analyticsindiamag.com/hisd-python-implementation-of-image-to-image-translation/
  88. Zooming Slow-Mo https://analyticsindiamag.com/guide-to-zooming-slow-mo-one-stage-space-time-video-super-resolution/
  89. Image Augmentation Pipelines with Tensorflow https://towardsai.net/p/machine-learning/building-complex-image-augmentation-pipelines-with-tensorflow-bed1914278d2
  90. TensorFlow2.0-Examples https://github.com/YunYang1994/TensorFlow2.0-Examples
  91. unadversarial https://github.com/microsoft/unadversarial/ https://analyticsindiamag.com/microsoft-research-unadversarial/
  92. CNNs 'see' - FilterVisualizations, Heatmaps,Saliency Maps,saliency_map_guided,Heat Map Visualizations,GradCAM,Class Activation Maps,ZFNet,Lucid,Activation Atlas,Blur Integrated Gradients,concept whitening,Integrated Gradients,SmoothGrad,PytorchRevelio,Feature Visualizer, Guided Gradients, grad_cam,sensitivity_analysis,Captum,Preliminary Methods,Plot Model Architecture,Visualize Filters,Activation based Methods,Maximal Activation,Image Occlusion,Gradient based Methods,Gradient based Class Activation Map
  93. Tools to Design or Visualize Architecture of Neural Network https://github.com/ashishpatel26/Tools-to-Design-or-Visualize-Architecture-of-Neural-Network
  94. quiver Interactive convnet features visualization for Keras https://github.com/keplr-io/quiver
  95. https://jair-neto.medium.com/a-powerful-method-for-explainability-of-object-detection-algorithms-ace0fe4623e7
  96. https://github.com/utkuozbulak/pytorch-cnn-visualizations https://microscope.openai.com/models https://github.com/balavenkatesh3322/CV-pretrained-model
  97. Mediapipe for Python https://google.github.io/mediapipe/
  98. imageai.Detection for Object detection
  99. cnn-raccoon interactive dashboards for your Convolutional Neural Networks with a single line of code https://github.com/lucko515/cnn-raccoon
  100. deit https://github.com/facebookresearch/deit https://wandb.ai/thibault-neveu/detr-tensorflow-log/reports/Finetuning-DETR-Object-Detection-with-Transformers-on-Tensorflow-A-step-by-step-tutorial--VmlldzozOTYyNzQ https://github.com/Visual-Behavior/detr-tensorflow
  101. awesome-computer-vision-models https://github.com/nerox8664/awesome-computer-vision-models
  102. EfficientDet https://github.com/ravi02512/efficientdet-keras
  103. Vision Transformer - Pytorch https://github.com/lucidrains/vit-pytorch https://github.com/alohays/awesome-visual-representation-learning-with-transformers
  104. T2T-ViT https://analyticsindiamag.com/complete-guide-to-t2t-vit-training-vision-transformers-efficiently-with-minimal-data/ https://github.com/yitu-opensource/T2T-ViT
  105. Explainability for Vision Transformers https://github.com/jacobgil/vit-explain
  106. https://keras.io/examples/vision/image_classification_with_vision_transformer/
  107. https://github.com/ashishpatel26/Vision-Transformer-Keras-Tensorflow-Pytorch-Examples https://github.com/google-research/vision_transformer
  108. DeepLab-v3-plus Semantic Segmentation in TensorFlow https://github.com/rishizek/tensorflow-deeplab-v3-plus
  109. DEEP LEARNING METHODS FOR 3D: 3D human pose estimation = 2D pose estimation + matching,Integral Human Pose Regression,Towards 3D Human Pose Estimation in the Wild: a Weakly-supervised Approach,A Simple Yet Effective Baseline for 3d Human Pose Estimation

  1. Apply data augmentation to increase the size of the dataset and improve the performance of the model
  2. low code object detection - detecto https://github.com/alankbi/detecto
  3. AutoML https://github.com/dataloop-ai/AutoML
  4. Object Detection with 10 lines of code-https://www.datasciencecentral.com/profiles/blogs/object-detection-with-10-lines-of-code https://towardsdatascience.com/object-detection-with-10-lines-of-code-d6cb4d86f606
  5. Detecto https://github.com/alankbi/detecto https://medium.com/analytics-vidhya/computer-vision-in-healthcare-detection-of-fractures-3313fe6452fc
  6. OneNet-https://analyticsindiamag.com/onenet/
  7. Norfair https://github.com/tryolabs/norfair
  8. Remo Improves Image Management https://www.freecodecamp.org/news/manage-computer-vision-datasets-in-python-with-remo/
  9. yolo https://github.com/zzh8829/yolov3-tf2 https://github.com/ultralytics/yolov5 https://github.com/ashishpatel26/Yolov5-King-of-object-Detection https://github.com/sicara/tf2-yolov4
  10. clip https://github.com/openai/CLIP
  11. Applying Bayesian inference to a CNN helps reduce overfitting; a CNN with Bayesian inference applied is called a Bayesian CNN https://analyticsindiamag.com/a-beginners-guide-to-bayesian-cnn/
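
As a rough illustration of the position and color augmentations listed in this section (cropping, flipping, brightness), here is a NumPy-only sketch. The function names, crop size and brightness range are assumptions for the example; libraries such as Albumentations or imgaug offer many more transforms and also handle bounding boxes and masks.

```python
import numpy as np

def horizontal_flip(image: np.ndarray) -> np.ndarray:
    """Mirror an (H, W, C) image left to right."""
    return image[:, ::-1]

def random_crop(image: np.ndarray, size: int, rng: np.random.Generator) -> np.ndarray:
    """Cut a random size x size window out of the image."""
    h, w = image.shape[:2]
    top = rng.integers(0, h - size + 1)
    left = rng.integers(0, w - size + 1)
    return image[top:top + size, left:left + size]

def adjust_brightness(image: np.ndarray, factor: float) -> np.ndarray:
    """Scale pixel values and clip them back into the uint8 range."""
    return np.clip(image.astype(np.float32) * factor, 0, 255).astype(np.uint8)

def augment(image: np.ndarray, rng: np.random.Generator, crop_size: int) -> np.ndarray:
    """Random crop, random flip (p=0.5) and random brightness in [0.8, 1.2]."""
    out = random_crop(image, crop_size, rng)
    if rng.random() < 0.5:
        out = horizontal_flip(out)
    return adjust_brightness(out, rng.uniform(0.8, 1.2))

rng = np.random.default_rng(0)
image = np.arange(8 * 8 * 3, dtype=np.uint8).reshape(8, 8, 3)   # stand-in for a real image
batch = [augment(image, rng, crop_size=6) for _ in range(4)]    # 4 augmented variants
print(batch[0].shape, batch[0].dtype)
```

Each call produces a slightly different view of the same image, which is how augmentation increases the effective size of the training set.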

3.Recurrent neural network (use for sequential data)

  1. 1.RNN
  2. 2.GRU
  3. 3.LSTM (has a memory cell, forget gate, etc.)
  4. Depth Gated RNNs,Peephole connection,Coupled Input and Forget,Clockwork RNNs,RNN Initialized Using Identity Matrix(IRNN)
  5. Temporal Convolutional Network (can outperform LSTM/GRU on some sequence tasks) https://github.com/ashishpatel26/tcn-keras-Examples
  6. 4.Information Discrimination Units (IDU) https://github.com/hjeun/idu
  7. Train an LSTM Model ~30x Faster Using PyTorch with GPU https://towardsdatascience.com/how-to-train-an-lstm-model-30x-faster-using-pytorch-with-gpu-e6bcd3134c86
  8. All 3 models above (RNN, GRU, LSTM) also have bidirectional variants; use them depending on the problem statement
  9. Quasi-Recurrent Neural Network https://github.com/salesforce/pytorch-qrnn
  10. textgenrnn https://github.com/minimaxir/textgenrnn
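
To show what "memory cell" and "forget gate" mean for the LSTM item above, here is a single LSTM time step written in plain NumPy. The weight layout (gates stacked in the order input, forget, candidate, output) and the sizes are assumptions for this sketch; Keras and PyTorch use their own internal layouts.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM step. Shapes: x (D,), h_prev and c_prev (H,), W (4H, D), U (4H, H), b (4H,)."""
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:H])            # input gate: how much new information to write
    f = sigmoid(z[H:2 * H])        # forget gate: how much of the old cell state to keep
    g = np.tanh(z[2 * H:3 * H])    # candidate values for the cell state
    o = sigmoid(z[3 * H:4 * H])    # output gate: how much of the cell state to expose
    c = f * c_prev + i * g         # updated memory cell
    h = o * np.tanh(c)             # updated hidden state
    return h, c

# Run a random sequence of 5 steps through the cell (toy sizes, random weights)
rng = np.random.default_rng(0)
D, H = 3, 4
W = rng.normal(scale=0.1, size=(4 * H, D))
U = rng.normal(scale=0.1, size=(4 * H, H))
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for x in rng.normal(size=(5, D)):
    h, c = lstm_step(x, h, c, W, U, b)
print(h.shape, c.shape)
```

A bidirectional model runs a second cell of the same kind over the reversed sequence and concatenates the two hidden states.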

4.Generative adversarial network https://poloclub.github.io/ganlab/ https://developers.google.com/machine-learning/gan/training

  1. gan lab https://poloclub.github.io/ganlab/
  2. https://neptune.ai/blog/generative-adversarial-networks-gan-applications?utm_source=twitter&utm_medium=tweet&utm_campaign=blog-generative-adversarial-networks-gan-applications
  3. Diffusion Models Beat GANs on Image Synthesis https://paperswithcode.com/paper/diffusion-models-beat-gans-on-image-synthesis?from=n9
  4. MUNIT: Multimodal Unsupervised Image-to-Image Translation (GAN)
  5. https://jonathan-hui.medium.com/gan-gan-series-2d279f906e7b
  6. generative adversarial transformers https://github.com/dorarad/gansformer
  7. LipGAN https://github.com/Rudrabha/LipGAN Wav2Lip https://github.com/Rudrabha/Wav2Lip
  8. BigGAN https://analyticsindiamag.com/hands-on-guide-to-biggan-with-python-code/
  9. CycleGAN,BigGAN,StyleGAN,DCGAN,cGAN,SRGAN,InfoGAN,StarGAN,AttnGAN,PixelRNN,StackGAN,DiscoGAN,LSGAN (Least Squares GAN),Conditional GAN (Pix2Pix),Progressive GANs (produce higher resolution images, Image-to-Image Translation),Face Inpainting,Super-resolution,Progressive Growing GAN,Instance-Conditioned GAN,Wasserstein GAN (WGAN, improves image generation),ChromaGAN,GANsformers,Conditional GAN and Unconditional GAN,Auxiliary Classifier GAN,Dual Video Discriminator GAN
  10. diffusion https://github.com/openai/guided-diffusion
  11. https://www.analyticsvidhya.com/blog/2021/05/progressive-growing-gan-progan/
  12. 5 Alternatives To Deep Nostalgia https://analyticsindiamag.com/top-5-alternatives-to-deep-nostalgia/
  13. MixNMatch https://github.com/Yuheng-Li/MixNMatch
  14. Quantum GAN https://analyticsindiamag.com/now-gans-are-being-used-for-drug-discovery-complete-guide-to-quantum-gan-with-python-code/
  15. https://analyticsindiamag.com/guide-to-differentiable-augmentation-for-data-efficient-gan-training/ https://analyticsindiamag.com/hands-on-python-guide-to-style-based-age-manipulation-sam-technique/
  16. Imaginaire https://analyticsindiamag.com/guide-to-nvidia-imaginaire-gan-library-in-python/
  17. Disentanglement https://analyticsindiamag.com/what-is-face-identity-disentanglement-and-how-it-outperformed-gans/
  18. StyleFlow https://github.com/RameenAbdal/StyleFlow
  19. https://github.com/hindupuravinash/the-gan-zoo https://analyticsindiamag.com/top-10-tools-for-generative-adversarial-networks/

5.Autoencoder

  1. 1.sparse Autoencoder
  2. 2.denoising Autoencoder
  3. 3.Contractive Autoencoder
  4. 4.stacked Autoencoder
  5. 5.deep Autoencoder
  6. 6.variational autoencoder
  7. 7.convolutional autoencoder
  8. Beta Variational Autoencoder,VAE with Linear Normalizing Flows ,VAE with Inverse Autoregressive Flows ,Disentangled Beta Variational Autoencoder,Disentangling by Factorising (FactorVAE),Beta-TC-VAE (BetaTCVAE),Importance Weighted Autoencoder (IWAE),VAE with perceptual metric similarity,Wasserstein Autoencoder (WAE),Info Variational Autoencoder,VAMP Autoencoder (VAMP),Hyperspherical VAE (SVAE),Adversarial Autoencoder (Adversarial_AE),Variational Autoencoder GAN (VAEGAN) ,Vector Quantized VAE (VQVAE),Hamiltonian VAE (HVAE),Regularized AE with L2 decoder param (RAE_L2),Regularized AE with gradient penalty (RAE_GP),Riemannian Hamiltonian VAE (RHVAE)
  9. https://github.com/zc8340311/RobustAutoencoder
  10. Applications of AutoEncoders,Dimensionality reduction,Anomaly detection,Image denoising,Image compression,Image generation
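
The dimensionality reduction use of autoencoders mentioned above can be sketched with a tiny linear autoencoder trained by gradient descent in NumPy. The synthetic data (4 features generated from 2 hidden factors), the learning rate and the function name are assumptions for this example; a real model would normally add non-linear layers and be built in Keras or PyTorch.

```python
import numpy as np

def train_linear_autoencoder(X, code_dim, lr=0.1, steps=2000, seed=0):
    """Train encoder/decoder matrices to minimise the mean squared reconstruction error."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    W_enc = rng.normal(scale=0.1, size=(d, code_dim))   # encoder: d -> code_dim
    W_dec = rng.normal(scale=0.1, size=(code_dim, d))   # decoder: code_dim -> d
    losses = []
    for _ in range(steps):
        code = X @ W_enc                 # compressed representation (bottleneck)
        err = code @ W_dec - X           # reconstruction error
        losses.append(float(np.mean(err ** 2)))
        grad_out = 2.0 * err / err.size  # gradient of the MSE w.r.t. the reconstruction
        grad_dec = code.T @ grad_out
        grad_enc = X.T @ (grad_out @ W_dec.T)
        W_enc -= lr * grad_enc
        W_dec -= lr * grad_dec
    return W_enc, W_dec, losses

# Synthetic data (assumption): 4 observed features driven by 2 hidden factors
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = np.array([[1.0, 0.0, 1.0, -1.0],
                   [0.0, 1.0, 1.0, 1.0]])
X = latent @ mixing

W_enc, W_dec, losses = train_linear_autoencoder(X, code_dim=2)
codes = X @ W_enc                        # 2-dimensional representation of each row
print(losses[0], losses[-1])
```

Because the data really has only 2 degrees of freedom, a 2-unit bottleneck can reconstruct it almost perfectly. Rows with a large reconstruction error are the usual starting point for anomaly detection.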

6.Boltzmann Machines,Restricted Boltzmann Machine,Deep Belief Network,Deep Boltzmann Machines

7.Self Organizing Maps (SOM), Fast Self-Organizing Map https://github.com/nmarincic/numbasom , minisom https://github.com/JustGlowing/minisom

8.Natural language processing

  1. regex,PRegEx (https://github.com/manoss96/pregex)
  2. Clean data (removing stopwords depending on the problem, lowercasing, tokenization, POS tagging, stemming or lemmatization depending on the problem, skip-gram, n-gram, chunking)
  3. clean text https://github.com/jfilter/clean-text
  4. Cleaning and Pre-processing textual data with NeatText library Automated NLP Pre-Processing using Data-Purifier Library https://github.com/Elysian01/Data-Purifier
  5. Nltk,spacy,gensim,textblob,inltk,Indic NLP,StanfordNLP,Pattern,stanza,OpenNLP,corenlp,polyglot,PyDictionary,Hugging Face,spark nlp,allen nlp,rasa nlu,Megatron,texthero,Flair,textacy,finetune,gluon-nlp,VnCoreNLP,fasttext,Langid,PyCLD3,Guesslang,Parrot libraries
  6. keyword library Rake_NLTK, Spacy, Textrank, Word cloud, KeyBert, Yake, MonkeyLearn API and Textrazor API.
  7. jiant is an NLP toolkit https://github.com/nyu-mll/jiant
  8. clean-text https://github.com/jfilter/clean-text https://www.youtube.com/watch?v=i2TjAgga1YU
  9. indicnlp https://indicnlp.ai4bharat.org/samanantar/
  10. Augmenting Data For NLP Tasks https://towardsdatascience.com/tips-tricks-augmenting-data-for-nlp-tasks-983e33ad55a7 https://amitness.com/2020/05/data-augmentation-for-nlp/ https://github.com/makcedward/nlpaug https://towardsdatascience.com/data-augmentation-in-nlp-2801a34dfc28
  11. NLP Data Augmenting https://lnkd.in/eHa2cH6
  12. Text Data Augmentation in Natural Language Processing with Texattack https://www.analyticsvidhya.com/blog/2022/02/text-data-augmentation-in-natural-language-processing-with-texattack/
  13. Tagalog: a tool for data management and labeling in Natural Language Processing https://www.tagalog.ai/tagalog/
  14. https://github.com/jasonwei20/eda_nlp https://github.com/dsfsi/textaugment https://github.com/QData/TextAttack https://github.com/makcedward/nlpaug
  15. nlp_profiler https://analyticsindiamag.com/complete-guide-on-nlp-profiler-python-tool-for-profiling-of-textual-dataset/
  16. doccano text annotation tool https://github.com/doccano/doccano https://www.youtube.com/watch?v=vT-GE_jssPk https://github.com/doccano/auto-labeling-pipeline https://github.com/doccano/doccano-client https://doccano.herokuapp.com/
  17. Data augmentation for NLP-https://github.com/makcedward/nlpaug
  18. Data Augmentation library for text nlpaug https://towardsdatascience.com/data-augmentation-library-for-text-9661736b13ff
  19. doccano,Parrot_Paraphraser,NLPAug,AugLy
  20. detext-https://github.com/linkedin/detext
  21. nlpaug-https://github.com/makcedward/nlpaug augmenty https://github.com/KennethEnevoldsen/augmenty
  22. NLP-progress -https://github.com/sebastianruder/NLP-progress
  23. Super Duper NLP Repo- https://notebooks.quantumstat.com/
  24. Multilingual Representations for Indian Languages https://tfhub.dev/google/MuRIL/1
  25. Natural Language Processing 365- https://ryanong.co.uk/natural-language-processing-365/
  26. 1 line for hundreds of NLP models and algorithms- https://github.com/JohnSnowLabs/nlu
  27. simpletransformers
  28. beautiful Wordclouds in Python https://towardsdatascience.com/how-to-easily-make-beautiful-wordclouds-in-python-55789102f6f5
  29. Automate your Text Processing workflow in a single line of Python Code https://towardsdatascience.com/automate-your-text-processing-workflow-in-a-single-line-of-python-code-e276755e45de
  30. quantumstat https://index.quantumstat.com/
  31. Dynaboard: Moving beyond accuracy to holistic model evaluation in NLP https://ai.facebook.com/blog/dynaboard-moving-beyond-accuracy-to-holistic-model-evaluation-in-nlp/
  32. gobbli for interactive NLP https://medium.com/rti-cds/using-gobbli-for-interactive-nlp-f60feb41e5cb
  33. AutoReg Regex of string in Python https://github.com/SusmitPanda/AutoReg
  34. Negation Handling Increasing Accuracy of Sentiment Classification
  35. NLU,NLG,NER,text summarization,Sentiment Analysis,Text Classifications,machine translation,chat bot,Text Generation,Speech Recognition
  36. Text preprocessing: case normalization (lowercasing), regex cleaning, sentence tokenization (sent_tokenize), word tokenization, removing punctuation, removing stopwords, removing Unicode characters, removing noise (URLs, hashtags, user mentions), removing numbers, replacing or removing emoticons, converting or removing emojis, spelling correction, expanding contractions, text normalization, stemming or lemmatization
  37. 1.One-hot-encoding,Index-based Encoding,Term Frequency,bag of words ,Bag of N-grams Model,Binary Term Frequency,(L1) Normalized Term Frequency,(L2) Normalized TF-IDF
  38. 2.Tfidf ,Weighted Class TF-IDF,tfidf + CHI²,HashingVectorizer
  39. 3.Word embedding: use a pre-trained model or a self-trained model
  40. a.using pretrained model
  41. i)word2vec( cbow,skipgram) ,AvgWord2vec
  42. ii)glove https://medium.com/spark-nlp/1-line-to-glove-word-embeddings-with-nlu-in-python-baed152fff4d
  43. iii)fastText
  44. iv)MetaVec
  45. b.creating your own embedding (use when you have a large amount of data)
  46. i)word2vec library
  47. ii)keras embedding
  48. ELMo (captures the meaning of a word in its context)
  49. Context-independent
  50. Context-independent without machine learning Bag-of-words,TF-IDF
  51. Context-independent with machine learning: Word2vec (Continuous Bag of Words (CBOW) and Skip-Gram), GloVe, fastText
  52. Context-dependent
  53. Context-dependent and RNN based(elmo,cove)
  54. Context-dependent and transformer-based (BERT ,xlm,RoBERTa,ALBERT)
  55. contextual embeddings: AllenNLP ELMo, OpenAI's GPT (GPT-1), GPT-2, GPT-3, and Google's BERT
  56. Fast_Sentence_Embeddings Compute Sentence Embeddings Fast https://github.com/oborchers/Fast_Sentence_Embeddings
  57. Universal Embeddings, Contextual Embeddings (Transformers),BERT Embeddings,Sentence Transformers,Sentence Vectors,Sentence Embedding
  58. Transformer based embedding
  59. 3 b.Tokenizer for NLP (texts_to_sequences)
  60. 4.Document embedding-Doc2vec
  61. 5.sentence embedding
  62. sense2vec,SENT2VEC,Universal sentence encoder,Sentence Transformers
  63. Top2Vec
  64. Topic Modelling https://towardsdatascience.com/april-edition-adventures-in-topic-modelling-7ee9081a48a0
  65. Doc2Vec (distributed memory model, distributed bag of words), Node2Vec, Top2Vec, Item2Vec
  66. Elmo, BERT,Universal Sentence Encoder, Sentence Transformers
  67. 6.using rnn,lstm,gru
  68. Conventional RNN,Deep Transition RNN,DT(S)-RNN,DOT-RNN,Stacked RNN
  69. the 3 models above (RNN, LSTM, GRU) also have bidirectional variants
  70. textgenrnn generate text https://github.com/minimaxir/textgenrnn
  71. 7.Encoder and Decoder(sequence to sequence), ProphetNet(new pretrained seq2seq model)
  72. 8.attention
  73. self attention, global attention, multi-head attention, local attention (monotonic, predictive), flash-attention (fast and memory-efficient exact attention) https://github.com/uzaymacar/attention-mechanisms
  74. Seq2seq with attention, self-attention, multi-head attention
  75. 9.Transformer (big breakthrough in NLP) - http://jalammar.github.io/illustrated-transformer/
  76. Build a Transformer in JAX from scratch https://theaisummer.com/jax-transformer/
  77. Trankit is a Light-Weight Transformer-based Python Toolkit for Multilingual Natural Language Processing https://github.com/nlp-uoregon/trankit
  78. FastFormers https://medium.com/ai-in-plain-english/fastformers-233x-faster-transformers-inference-on-cpu-4c0b7a720e1
  79. Shrinking Transformers (reducing model size): quantization, distillation, pruning
  80. Reformer,Performers,vision transformer
  81. Reformer: The Efficient Transformer
  82. Longformer: The Long-Document Transformer
  83. ELECTRA: Pre-training Text Encoders as Discriminators Rather Than Generators
  84. DeLighT: Deep and Light-weight Transformer https://analyticsindiamag.com/complete-guide-to-delight-deep-and-light-weight-transformer/
  85. https://github.com/balavenkatesh3322/NLP-pretrained-model
  86. Tree-Transformer https://github.com/yaushian/Tree-Transformer
  87. Scalable Transformer-based Model https://analyticsindiamag.com/guide-to-perceiver-a-scalable-transformer-based-model/
  88. Transformers Interpret https://towardsdatascience.com/introducing-transformers-interpret-explainable-ai-for-transformers-890a403a9470 https://github.com/cdpierse/transformers-interpret https://analyticsindiamag.com/hands-on-guide-to-the-evolved-transformer-on-neural-machine-translation/
  89. Novel Interpretable Transformer https://github.com/hila-chefer/Transformer-Explainability https://analyticsindiamag.com/compute-relevancy-of-transformer-networks-via-novel-interpretable-transformer/
  90. https://www.kdnuggets.com/2021/02/hugging-face-transformer-basics.html#.YE7gRy9s-LA.linkedin
  91. mBART-50 https://www.youtube.com/watch?v=fxZtz0LPJLE&feature=youtu.be
  92. Few-shot classification with SetFit and a custom dataset https://rubrix.readthedocs.io/en/docs-setfit_tutorial/tutorials/few-shot-classification-with-setfit.html
  93. 10.BERT, Packed BERT, BART, DynaBERT, SBERT, ConvBERT, Quantized MobileBERT, ALBERT, ELECTRA, ARBERT, MARBERT, Transformer-XL, Longformer, Reformer, Linformer, DistilBERT, ELMo, RoBERTa, XLNet, XLM-RoBERTa, DeBERTa, T5, fastT5, CodeT5, mT5, ByT5, simpleT5, OnnxT5, GPT, GPT-2, GPT-3, GPT-Neo, GPT-NeoX, GPT-J, aitextgen, PRADO, PET, BORT, MuRIL, Multitask Unified Model, AI21's Jurassic language model, Turing NLG, Wu Dao 2.0, PanGu-Alpha, Gopher, Megatron
  94. https://neptune.ai/blog/bert-and-the-transformer-architecture-reshaping-the-ai-landscape
  95. gpt3 https://www.producthunt.com/posts/100-resources-on-gpt-3
  96. Graph4NLP https://dlg4nlp.github.io/index.html
  97. Feedback Transformers from Facebook AI https://towardsdatascience.com/feedback-transformers-from-facebook-ai-221c5dd09e3f
  98. DETR https://analyticsindiamag.com/how-to-detect-objects-with-detection-transformers/ https://github.com/dddzg/up-detr
  99. DeiT https://analyticsindiamag.com/introducing-deit-data-efficient-image-transformers/ https://github.com/facebookresearch/deit
  100. 80+ NLP tasks https://medium.com/innerdoc/80-natural-language-processing-tasks-described-c777bc4974b3
  101. Text-to-Image https://www.datasciencecentral.com/profiles/blogs/summarizing-popular-text-to-image-synthesis-methods-with-python
  102. NLP: Pre-trained Sentiment Analysis https://medium.com/@b.terryjack/nlp-pre-trained-sentiment-analysis-1eb52a9d742c
  103. Awesome-NLP-Resources -https://github.com/Robofied/Awesome-NLP-Resources https://shivanandroy.com/awesome-nlp-resources/ https://github.com/keon/awesome-nlp
  104. 10 Popular Keyword Extraction Algorithms in Natural Language Processing https://prakhar-mishra.medium.com/10-popular-keyword-extraction-algorithms-in-natural-language-processing-8975ada5750c
  105. https://medium.com/@jatinmandav3/opinion-mining-sometimes-known-as-sentiment-analysis-or-emotion-ai-refers-to-the-use-of-natural-874f369194c0#:~:text=fastText%20is%20a%20library%20for,pretrained%20models%20for%20294%20languages
  106. https://analyticsindiamag.com/top-ten-bert-alternatives-for-nlu-projects/ https://towardsdatascience.com/from-pre-trained-word-embeddings-to-pre-trained-language-models-focus-on-bert-343815627598
  107. GPT2 generated Indian Food Recipes https://www.kaggle.com/nulldata/gpt2-generated-indian-food-recipes
  108. http://jalammar.github.io/ http://jalammar.github.io/illustrated-bert/ http://jalammar.github.io/a-visual-guide-to-using-bert-for-the-first-time/
  109. https://jalammar.github.io/explaining-transformers/ https://jalammar.github.io/hidden-states/
  110. https://www.kdnuggets.com/2019/09/bert-roberta-distilbert-xlnet-one-use.html
  111. 11.Speech (Braina,Dragon Speech Recognition Solutions ,Winscribe,Gboard,Windows 10 Speech Recognition,Otter,Speechnotes,tts,OpenSpeech,FRILL,Vakyansh)
  112. audio data augmentation https://github.com/iver56/audiomentations
  113. speech to text
  114. text to speech https://towardsdatascience.com/text-to-speech-one-small-step-by-mankind-to-create-lifelike-robots-54e19f843b21
  115. Acoustic model,Speaker diarisation,apis,apiai,assemblyai,google-cloud-speech,pocketsphinx,SpeechRecognition,watson-developer-cloud,wit,Coqui TTS,Mozilla TTS, OpenTTS,ESPNet,PaddleSpeech,Wav2Vec, Whisper, DeepSpeech,Eesen,TensorFlowASR,Vosk,CMUSphinx,Pocketsphinx,KoNLPy,Madmom,HTK,Pysptk,Tortoise TTS,Bark,Musicgen,Riffusion
  116. Microsoft IceCAPS is an Open Source Framework for Conversational Modeling https://pub.towardsai.net/microsoft-icecaps-is-an-open-source-framework-for-conversational-modeling-4f78492ca685
  117. State-of-the-art Approaches to Building Open-Domain Conversational Agents https://www.topbots.com/conversational-ai-open-domain-chatbots/?utm_source=twitter&utm_medium=company_post&utm_campaign=conversational_open_domain_chatbots
  118. LaMDA: our breakthrough conversation technology https://www.blog.google/technology/ai/lamda
  119. assemblyai https://www.assemblyai.com/
  120. bark https://github.com/suno-ai/bark
  121. SpeechBrain A PyTorch Powered Speech Toolkit https://speechbrain.github.io/ https://github.com/speechbrain/speechbrain
  122. Wav2vec-U learns to recognize speech from unlabeled data https://venturebeat.com/2021/05/21/facebook-wav2vec-u-learns-to-recognize-speech-from-unlabeled-data/?utm_source=dlvr.it&utm_medium=linkedin
  123. Wav2Vec2 https://huggingface.co/transformers/model_doc/wav2vec2.html https://www.youtube.com/watch?v=dJAoK5zK36M&feature=youtu.be
  124. SincNet is a neural architecture for efficiently processing raw audio samples https://github.com/mravanelli/SincNet
  125. HuggingFace Transformers ASR https://github.com/dennisbakhuis/Ecare_Brunch_ASR
  126. Whisper speech recognition (English and multilingual) https://github.com/openai/whisper
  127. https://github.com/balavenkatesh3322/audio-pretrained-model
  128. SpeechRecognition ASR2K: Speech Recognition https://github.com/xinjli/asr2k
  130. googletrans (google Translator) https://pypi.org/project/googletrans/
  131. lang-identification Google Compact Language Detector,FastText
  132. gTTS for text-to-speech conversion, speech_recognition, TTS
  133. Python/Pytorch app for easily synthesising human voices https://github.com/BenAAndrew/Voice-Cloning-App
  134. Speech-Transformer-tf2.0 https://github.com/xingchensong/Speech-Transformer-tf2.0
  136. ecco https://github.com/jalammar/ecco https://www.eccox.io/ https://www.youtube.com/watch?v=rHrItfNeuh0&feature=youtu.be
  137. Language Interpretability Tool (LIT) is an open-source platform for visualization and understanding of NLP models https://pair-code.github.io/lit/
  138. Language Interpretability Tool https://github.com/pair-code/lit https://ai.googleblog.com/2020/11/the-language-interpretability-tool-lit.html
  139. autonlp https://analyticsindiamag.com/hands-on-guide-to-using-autonlp-for-automating-sentiment-analysis/
  140. https://medium.com/towards-artificial-intelligence/natural-language-processing-nlp-with-python-tutorial-for-beginners-1f54e610a1a0
  141. https://pakodas.substack.com/p/neural-search-on-indian-languages
  142. https://www.linkedin.com/pulse/natural-language-processing-2020-year-review-ivan-bilan/?trackingId=CYfd1ZyLStu6x09tjVIoGw%3D%3D
  143. ConvBert https://github.com/yitu-opensource/ConvBert
  144. Python interface for building, loading, and using GloVe vectors https://github.com/Lguyogiro/pyglove
  145. SentenceTransformers https://www.sbert.net/
  146. Reformer – The Efficient Transformer https://analyticsindiamag.com/hands-on-guide-to-reformer-the-efficient-transformer/
  147. Funnel-Transformer https://github.com/laiguokun/Funnel-Transformer
  148. CLIP – Connecting Text To Images https://analyticsindiamag.com/hands-on-guide-to-openais-clip-connecting-text-to-images/
  149. Topic Modeling in One Line with Top2Vec https://towardsdatascience.com/topic-modeling-in-one-line-with-top2vec-a413991aa0ef
  150. MT5-https://venturebeat.com/2020/10/26/google-open-sources-mt5-a-multilingual-model-trained-on-over-101-languages/?utm_content=144321587&utm_medium=social&utm_source=linkedin&hss_channel=lcp-3740012
  151. VADER does not require any training data https://pypi.org/project/vaderSentiment/ https://analyticsindiamag.com/sentiment-analysis-made-easy-using-vader/
  152. Applications of machine translation: text-to-text, text-to-speech, speech-to-text, speech-to-speech, image (of words)-to-text
  153. Google GNMT (TensorFlow), Facebook fairseq (Torch), Amazon Sockeye (MXNet), NEMATUS (Theano), THUMT (Theano), OpenNMT (PyTorch), StanfordNMT (Matlab), DyNet-lamtram (CMU), EUREKA (MangoNMT)
  154. awesome-gpt3 https://github.com/elyase/awesome-gpt3
  155. Robustness Gym: Evaluation Toolkit for NLP https://github.com/robustness-gym/robustness-gym
  156. https://analyticsindiamag.com/best-nlp-based-seo-tools-for-2021/ https://towardsdatascience.com/5-nlp-models-that-you-need-to-know-about-754594a3225b
  157. https://www.kdnuggets.com/2020/05/best-nlp-deep-learning-course-free.html https://analyticsindiamag.com/flair-hands-on-guide-to-robust-nlp-framework-built-upon-pytorch/
  158. https://medium.com/modern-nlp/nlp-metablog-a-blog-of-blogs-693e3a8f1e0c
  159. summarization https://github.com/hyunwoongko/summarizers ctrl-sum https://github.com/salesforce/ctrl-sum
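
The text preprocessing and vectorization steps in the list above (lowercasing, removing URLs/hashtags/mentions and punctuation, tokenization, stopword removal, term frequency, TF-IDF) can be sketched in plain Python. This is only a minimal illustration, not how any library named above does it: the tiny stopword set, the helper names `preprocess` and `tfidf`, and the smoothed IDF formula `log((1 + N) / (1 + df)) + 1` are assumptions picked for the example.

```python
import math
import re
from collections import Counter

# Tiny illustrative stopword set (an assumption; real projects use a fuller list).
STOPWORDS = {"the", "a", "an", "is", "are", "and", "of", "to", "in", "on"}

def preprocess(text):
    """Lowercase, strip URLs, mentions, hashtags and punctuation, tokenize, drop stopwords."""
    text = text.lower()
    text = re.sub(r"https?://\S+", " ", text)   # remove URLs
    text = re.sub(r"[@#]\w+", " ", text)        # remove user mentions and hashtags
    text = re.sub(r"[^a-z\s]", " ", text)       # remove punctuation, digits, emojis
    return [tok for tok in text.split() if tok not in STOPWORDS]

def tfidf(docs):
    """Return one {term: weight} dict per document (term frequency times smoothed IDF)."""
    tokenized = [preprocess(d) for d in docs]
    n_docs = len(tokenized)
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for toks in tokenized for term in set(toks))
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero for empty documents
        vectors.append({
            term: (cnt / total) * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, cnt in counts.items()
        })
    return vectors

docs = [
    "The cat sat on the mat! https://example.com",
    "A dog chased the cat @bob #pets",
    "Dogs and cats are pets.",
]
tokens = preprocess(docs[0])
vectors = tfidf(docs)
```

Terms that appear in fewer documents get a larger weight, so in the first document "sat" ends up weighted higher than "cat", which also occurs in the second document.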

classification,clustering,recommender systems,topic modelling,sentiment analysis,semantic analysis,summarization,machine translation,conversational interface,named entity recognition
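
Many of the NLP models listed above (Transformer, BERT, GPT) are built on self-attention. A minimal sketch of scaled dot-product attention, softmax(Q K^T / sqrt(d_k)) V, in plain Python follows. The function names and the toy input are hypothetical, and real implementations use batched tensors plus learned projection matrices for Q, K and V.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def scaled_dot_product_attention(queries, keys, values):
    """Return (outputs, weights) for lists of query, key and value vectors."""
    d_k = len(keys[0])
    outputs, weights = [], []
    for q in queries:
        # Similarity of this query with every key, scaled by sqrt(d_k).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d_k) for k in keys]
        w = softmax(scores)
        weights.append(w)
        # Output is the attention-weighted average of the value vectors.
        outputs.append([
            sum(w_j * v[c] for w_j, v in zip(w, values))
            for c in range(len(values[0]))
        ])
    return outputs, weights

# Self-attention on a toy sequence: the same vectors act as Q, K and V.
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = scaled_dot_product_attention(x, x, x)
```

Each row of `weights` sums to 1 and says how much one token attends to every token in the sequence.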

F.Time Series Hands-On Guide To Atspy For Automating The Time-Series Forecasting https://github.com/Apress/hands-on-time-series-analylsis-python

  1. here the data split is different (train, validate, test): split in time order, without random shuffling
  2. here missing data is handled differently (see the imputation methods below)
  3. Time Series Decomposition In Python trend, seasonality,Cyclical and noise https://towardsdatascience.com/time-series-decomposition-in-python-8acac385a5b2
  4. Removing trend: differencing, least-squares trend removal
  5. Converting a non-stationary series into a stationary one: detrending, differencing, transformation
  6. Time series decomposition: log, Box-Cox transformation, moving average
  7. Removing seasonality: seasonal differencing, seasonal means, method of moving averages
  8. methods generally used to impute data in time series:
  9. 1.ffill
  10. 2.bfill
  11. 3.impute with the mean of the previous or next x samples
  12. 4.impute with the value from the previous season (data with trend)
  13. 5.mean,mode,median,random sample imputation (data without trend and without seasonality)
  14. 6.linear interpolation(data with trend and without seasonality)
  15. 7.seasonal +interpolation(data with trend and with seasonality)
  16. here model selection depends on properties of the data such as stationarity, trend, seasonality, and cyclic behavior
  17. Anomaly Detection using Isolation Forest,AutoEncoders
  18. Granger causality statistical test: checks whether one variable is useful for forecasting another
  19. Stationarity checks: ADF (adfuller) and KPSS statistical tests, ACF and PACF plots, decomposition
  20. Handling Data with Regular Gaps using Facebook Prophet
  21. models
  22. 1.AR, VAR, MA, ARMA, ARIMA, auto ARIMA (pmdarima), seasonal ARIMA (SARIMA), SARIMAX models
  23. 2.Autoregressive,Vector Autoregression,Vector Autoregression Moving-Average,Vector Autoregression Moving-Average with Exogenous Regressors
  24. 3.Moving average, exponential moving average, exponential smoothing, simple average, Holt's linear trend method, Holt-Winters seasonal method, DeepAR, N-BEATS
  25. 11 Classical Time Series Forecasting Methods in Python https://machinelearningmastery.com/time-series-forecasting-methods-in-python-cheat-sheet/
  26. 4.XGBoost, LSTM (neural network), DeepAR (an RNN-based algorithm)
  27. 5.GARCH
  28. atspy Automated time-series models
  29. 6.Naive forecasts
  30. 7.Smoothing (moving average,exponential smoothing)
  31. 8.Facebook Prophet (note: expects the date column named ds and the target column named y) https://thecleverprogrammer.com/2020/12/14/facebook-prophet-model-with-python/
  32. NeuralProphet Model- https://ourownstory.github.io/neural_prophet/model-overview/ https://thecleverprogrammer.com/2021/01/28/neuralprophet-model-with-python/
  33. bulbea Deep Learning based Python Library for Stock Market Prediction and Modelling https://github.com/achillesrasquinha/bulbea
  34. PyTorch Forecasting enables deep learning models for time-series forecasting pytorch-ts https://github.com/zalandoresearch/pytorch-ts
  35. ETSformer-pytorch https://github.com/lucidrains/ETSformer-pytorch
  36. Transformer Networks to build a Forecasting model https://towardsdatascience.com/how-to-use-transformer-networks-to-build-a-forecasting-model-297f9270e630
  37. Temporal Fusion Transformer (By Google)
  38. hmmlearn https://github.com/ushareng/StockPricePredictionUsingHMM_Byte/blob/master/StockPricePredictionUsingHMM.ipynb
  39. pyramid-arima https://github.com/tgsmith61591/pyramid
  40. pyflux: time series library: https://github.com/RJT1990/pyflux
  41. orbit https://eng.uber.com/orbit/
  42. greykite A flexible, intuitive and fast forecasting library https://github.com/linkedin/greykite https://www.analyticsvidhya.com/blog/2021/05/greykite-time-series-forecasting-in-python/
  43. Silverkite
  44. LinkedIn open-sources Greykite, a library for time series forecasting https://github.com/linkedin/greykite/stargazers
  45. stumpy https://github.com/TDAmeritrade/stumpy
  46. Giotto-Time Time-Series Forecasting Python Library https://github.com/giotto-ai/giotto-time https://analyticsindiamag.com/guide-to-giotto-time-a-time-series-forecasting-python-library/
  47. Informer (for Long Sequence Time-Series Forecasting) https://analyticsindiamag.com/informer/
  48. tfcausalimpact https://github.com/WillianFuks/tfcausalimpact
  49. DeepAR is a global model https://www.youtube.com/watch?v=xcbj0RE3kfI&list=PL3N9eeOlCrP5cK0QRQxeJd6GrQvhAtpBK&index=14
  50. pmdarima for Auto ARIMA
  51. GluonTS https://github.com/awslabs/gluon-ts
  52. sktime a unified time-series framework for Scikit-Learn
  53. tsfresh a magical library for feature extraction in time-series datasets
  54. ThymeBoost Forecasting with Gradient Boosted Time Series Decomposition https://github.com/tblume1992/ThymeBoost
  55. darts A python library for easy manipulation and forecasting of time series https://github.com/unit8co/darts
  56. Kats https://github.com/facebookresearch/Kats
  57. Time Series Outlier Detection with ThymeBoost
  58. AtsPy: Automated Time Series Models in Python https://github.com/firmai/atspy
  59. Merlion: A Machine Learning Framework for Time Series Intelligence https://github.com/salesforce/Merlion
  60. stumpy powerful and scalable Python library for modern time series analysis https://github.com/TDAmeritrade/stumpy
  61. mlforecast Scalable machine learning based time series forecasting https://github.com/Nixtla/mlforecast
  62. statsforecast Lightning ⚡️ fast forecasting with statistical and econometric models https://github.com/Nixtla/statsforecast
  63. 9.Holt-Winters, Holt's linear trend
  64. 10.Auto_Timeseries by auto-ts (tells which model is best for the data) https://www.youtube.com/watch?v=URUiVD37fns&list=PL3N9eeOlCrP5cK0QRQxeJd6GrQvhAtpBK&index=24
  65. AutoTS-https://analyticsindiamag.com/hands-on-guide-to-autots-effective-model-selection-for-multiple-time-series/ https://github.com/AutoViML/Auto_TS
  66. Automated Time Series Forecasting https://github.com/winedarksea/AutoTS , No-Code AI Forecasting Platform https://datafloat.ai/
  67. AutoML for time series: advanced approaches with FEDOT framework https://towardsdatascience.com/automl-for-time-series-advanced-approaches-with-fedot-framework-4f9d8ea3382c
  68. AutoML for time series: definitely a good idea https://towardsdatascience.com/automl-for-time-series-definitely-a-good-idea-c51d39b2b3f
  70. pytsal low-code open-source python framework for Time Series analysis,visualization,forecasting along with AutoTS https://github.com/KrishnanSG/pytsal
  72. Forecasting with H2O AutoML https://github.com/business-science/modeltime.h2o/
  73. Forecasting Stock Prices Using Stocker https://medium.com/mlearning-ai/forecasting-stock-prices-using-stocker-7d2ac15966f5
  74. MiniRocket: Fast(er) and Accurate Time Series Classification https://towardsdatascience.com/minirocket-fast-er-and-accurate-time-series-classification-cdacca2dcbfa
  75. modeltime https://github.com/business-science/modeltime
  76. GluonTS , PytorchTS https://analyticsindiamag.com/gluonts-pytorchts-for-time-series-forecasting/
  77. stocker https://medium.datadriveninvestor.com/forecasting-stock-prices-using-stocker-66503c26307a
  78. 11.Temporal Convolutional Network (TCN)
  79. 12.Atspy For Automating The Time-Series Forecasting-https://analyticsindiamag.com/hands-on-guide-to-atspy-for-automating-the-time-series-forecasting/
  80. 13.Darts-https://analyticsindiamag.com/hands-on-guide-to-darts-a-python-tool-for-time-series-forecasting/
  81. 14.Bayesian Neural Network , TsEuler
  82. 15.PyFlux (easy way to compare different models)-https://analyticsindiamag.com/pyflux-guide-python-library-for-time-series-analysis-and-prediction/
  83. 16.Orbit , DeepAR ,NeuralProphet(https://github.com/ourownstory/neural_prophet https://ourownstory.github.io/neural_prophet/model-overview/)
  84. IBMs AutoAI automates time series forecasting https://www.ibm.com/blogs/research/2021/03/autoai-time-series/?utm_campaign=Learning%20Posts&utm_content=159454790&utm_medium=social&utm_source=twitter&hss_channel=tw-3018841323
  85. Kats: all-in-one toolkit for time series data https://github.com/facebookresearch/kats https://facebookresearch.github.io/Kats/
  86. orbit https://analyticsindiamag.com/hands-on-guide-to-orbit-ubers-python-framework-for-bayesian-forecasting-inference/ https://github.com/uber/orbit
  87. best article-https://www.analyticsvidhya.com/blog/2018/02/time-series-forecasting-methods/
  88. TimeSynth https://github.com/TimeSynth/TimeSynth https://analyticsindiamag.com/guide-to-timesynth-a-python-library-for-synthetic-time-series-generation/
  89. time series visualization tool https://plotjuggler.io/
  90. Time Series Anomaly Detection using Generative Adversarial Networks(TadGAN) https://analyticsindiamag.com/hands-on-guide-to-tadgan-with-python-codes/
  91. fastquant Backtest and optimize your trading strategies with only 3 lines of code https://github.com/enzoampil/fastquant
  92. pytorch-forecasting https://github.com/jdb78/pytorch-forecasting https://analyticsindiamag.com/guide-to-pytorch-time-series-forecasting/
  93. https://pytorch-forecasting.readthedocs.io/en/latest/ https://pytorch-forecasting.readthedocs.io/en/latest/tutorials/ar.html
  94. Complex Exponential Smoothing (CES), which can handle both stationary and non-stationary processes and model a wide spectrum of level and trend time series. https://github.com/Nixtla/statsforecast/tree/main/experiments/ces
  95. sktime-https://github.com/alan-turing-institute/sktime https://analyticsindiamag.com/sktime-library/
  96. autocast https://github.com/andyzoujm/autocast
  99. tcn https://towardsdatascience.com/farewell-rnns-welcome-tcns-dd76674707c8
  100. Pastas https://analyticsindiamag.com/guide-to-pastas-a-python-framework-for-hydrogeological-time-series-analysis/ https://github.com/pastas/pastas
  101. stockDL https://github.com/ashishpapanai/stockDL
  102. decomposition https://towardsdatascience.com/time-series-decomposition-in-python-8acac385a5b2
  103. Bayesian Diffusion Modeling https://www.topbots.com/bayesian-diffusion-modeling/
  104. Top 10 Python Tools For Time Series Analysis https://analyticsindiamag.com/top-10-python-tools-for-time-series-analysis/
  105. Fine-Tune Your Machine Learning Models To Improve Forecasting Accuracy https://www.kdnuggets.com/2019/01/fine-tune-machine-learning-models-forecasting.html
  106. add extra features https://towardsdatascience.com/the-demand-sales-forecast-technique-every-data-scientist-should-be-using-to-reduce-error-1c6f25add9cb
  107. https://machinelearningmastery.com/time-series-forecasting-methods-in-python-cheat-sheet/
  108. https://www.machinelearningplus.com/time-series/time-series-analysis-python/ https://www.datasciencecentral.com/profiles/blogs/list-of-time-series-methods-in-one-picture
  109. https://github.com/Apress/hands-on-time-series-analylsis-python
  110. https://otexts.com/fpp2/simple-methods.html
  111. https://analyticsindiamag.com/top-time-series-deep-learning-methods/
  112. book https://otexts.com/fpp2/
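
The imputation methods (ffill, bfill, linear interpolation) and the differencing step from the time series list above can be sketched in plain Python. This is a minimal illustration on a list where `None` marks a missing value; the function names are hypothetical, and in practice pandas offers equivalent operations on a Series.

```python
def ffill(series):
    """Forward fill: replace each None with the last observed value.
    Leading None values stay None because nothing was observed yet."""
    filled, last = [], None
    for x in series:
        if x is not None:
            last = x
        filled.append(last)
    return filled

def bfill(series):
    """Backward fill: replace each None with the next observed value."""
    return ffill(series[::-1])[::-1]

def linear_interpolate(series):
    """Fill interior gaps on a straight line between the nearest observed neighbours."""
    out = list(series)
    known = [i for i, x in enumerate(out) if x is not None]
    for left, right in zip(known, known[1:]):
        step = (out[right] - out[left]) / (right - left)
        for i in range(left + 1, right):
            out[i] = out[left] + step * (i - left)
    return out

def difference(series, lag=1):
    """Differencing: lag=1 removes a linear trend, lag=season length removes seasonality."""
    return [series[i] - series[i - lag] for i in range(lag, len(series))]

raw = [10.0, None, None, 16.0, 18.0, None, 22.0]
forward = ffill(raw)
backward = bfill(raw)
interpolated = linear_interpolate(raw)
detrended = difference(interpolated)           # constant values: the trend is gone
deseasonalized = difference([1, 2, 3, 11, 12, 13], lag=3)
```

Forward and backward fill keep flat segments, so they suit data without trend; linear interpolation follows the trend, which matches the guidance in the list.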

deep_autoviml Build tensorflow keras model pipelines in a single line of code https://github.com/AutoViML/deep_autoviml

G.Graph Neural Networks

  1. Spatial-temporal graph neural networks,Structural Deep Network Embedding,Convolutional Graph Neural Network,GraphSAGE,Graph convolutional recurrent network,Diffusion convolutional recurrent neural network,Graph LSTM,Graph Autoencoders,Variational Graph Auto-Encoders,Graph Attention Networks

G.Semi supervised learning,Self-Supervised Learning,Multi-Instance Learning

self-training meta-estimator for semi-supervised learning
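
The idea behind self-training can be sketched in a few lines: fit on the labeled data, pseudo-label the unlabeled points the model is confident about, add them to the training set, and repeat. The sketch below is not any library's meta-estimator; it assumes a toy 1-D nearest-centroid classifier, and the confidence formula, the `threshold` value and all function names are hypothetical.

```python
def fit_centroids(points, labels):
    """Base model: one centroid (mean) per class."""
    sums, counts = {}, {}
    for x, y in zip(points, labels):
        sums[y] = sums.get(y, 0.0) + x
        counts[y] = counts.get(y, 0) + 1
    return {y: sums[y] / counts[y] for y in sums}

def predict_with_confidence(centroids, x):
    """Predict the nearest class; confidence grows as the second-nearest centroid gets farther."""
    dists = sorted((abs(x - c), y) for y, c in centroids.items())
    (d1, label), (d2, _) = dists[0], dists[1]
    confidence = 1.0 - d1 / (d1 + d2) if (d1 + d2) > 0 else 0.5
    return label, confidence

def self_train(labeled_x, labeled_y, unlabeled_x, threshold=0.75, max_rounds=10):
    """Repeatedly add confident pseudo-labels to the training set and refit."""
    x, y = list(labeled_x), list(labeled_y)
    pool = list(unlabeled_x)
    for _ in range(max_rounds):
        centroids = fit_centroids(x, y)
        confident, remaining = [], []
        for u in pool:
            label, conf = predict_with_confidence(centroids, u)
            (confident if conf >= threshold else remaining).append((u, label))
        if not confident:
            break  # nothing new to learn from
        for u, label in confident:
            x.append(u)
            y.append(label)
        pool = [u for u, _ in remaining]
    return fit_centroids(x, y), pool

centroids, still_unlabeled = self_train(
    labeled_x=[0.0, 10.0],
    labeled_y=["a", "b"],
    unlabeled_x=[1.0, 2.0, 9.0, 8.0, 5.0],
)
```

The point 5.0 sits halfway between the two classes, so it never passes the confidence threshold and stays unlabeled.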

skweak: A Python Toolkit For Applying Weak Supervision To NLP Tasks https://analyticsindiamag.com/meet-skweak-a-python-toolkit-for-applying-weak-supervision-to-nlp-tasks/

10 Self-Supervised Learning Frameworks & Libraries To Use In 2021 https://analyticsindiamag.com/10-self-supervised-learning-frameworks-libraries-to-use-in-2021/

Self-Supervised Learning https://github.com/jason718/awesome-self-supervised-learning

OpenMMLab Self-Supervised Learning https://github.com/open-mmlab/mmselfsup

Self-supervised Video Object Segmentation https://charigyang.github.io/motiongroup/

lightly A python library for self-supervised learning on images https://github.com/lightly-ai/lightly

Weak Supervision: The Art Of Training ML Models From Noisy Data https://analyticsindiamag.com/weak-supervision-the-art-of-training-ml-models-from-noisy-data/

Libraries to explore for weak supervision in NLP: snorkel, skweak

8 Resources To Learn Self-Supervised Learning In 2021 https://analyticsindiamag.com/top-8-resources-to-learn-self-supervised-learning-in-2021/

Barlow Twins: Self-Supervised Learning via Redundancy Reduction https://analyticsindiamag.com/a-guide-to-barlow-twins-self-supervised-learning-via-redundancy-reduction/ https://github.com/facebookresearch/barlowtwins

H.Active learning,Multi-Task Learning,Online Learning

Active Learning Frameworks https://towardsdatascience.com/a-summary-of-active-learning-frameworks-3165159baae9

Meta Learning https://github.com/sudharsan13296/Awesome-Meta-Learning

Avalanche: A Python Library for Continual Learning https://analyticsindiamag.com/avalanche-a-python-library-for-continual-learning/

Reptile (OpenAI’s Latest Meta-Learning Algorithm) https://github.com/openai/supervised-reptile https://analyticsindiamag.com/reptile-openais-latest-meta-learning-algorithm/

Model-Agnostic Meta-Learning for Fast Adaptation of Deep Networks https://github.com/cbfinn/maml

I.Transfer learning: inductive transfer learning (similar domain, different task: the source and target domains are the same but the tasks differ), transductive transfer learning (similar task, different domain), unsupervised transfer learning (different task, different but similar enough domain; unsupervised tasks for both source and target), self-taught learning, homogeneous transfer learning, heterogeneous transfer learning

Transfer Learning Using TensorFlow Keras https://analyticsindiamag.com/transfer-learning-using-tensorflow-keras/

https://github.com/artix41/awesome-transfer-learning

https://towardsdatascience.com/a-comprehensive-hands-on-guide-to-transfer-learning-with-real-world-applications-in-deep-learning-212bf3b2f27a

J.Deep dream,Style transfer

K.One-shot learning,Zero-shot learning

L.Incremental Training https://blog.rasa.com/rasa-new-incremental-training/

https://github.com/ChristosChristofidis/awesome-deep-learning

101 Machine Learning Algorithms for Data Science with Cheat Sheets https://blog-datasciencedojo-com.cdn.ampproject.org/c/s/blog.datasciencedojo.com/machine-learning-algorithms/amp/

TYPES OF ACTIVATION FUNCTIONS: linear, ReLU, Leaky ReLU, Parametric ReLU (PReLU), Shifted ReLU, Noisy ReLU, Smooth ReLU, ELU, GELU, sigmoid, tanh, softmax, softplus, Swish, Mish, Elliot
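
Several of the activation functions named above can be written directly from their standard formulas. The scalar versions below are a minimal sketch in plain Python; the default `alpha` values are common choices assumed for the example, and deep learning frameworks ship vectorized versions of all of these.

```python
import math

def relu(x):
    return max(0.0, x)

def leaky_relu(x, alpha=0.01):
    # Small slope alpha for negative inputs instead of a hard zero.
    return x if x > 0 else alpha * x

def elu(x, alpha=1.0):
    return x if x > 0 else alpha * (math.exp(x) - 1.0)

def sigmoid(x):
    # Two branches keep exp() from overflowing for large |x|.
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def tanh(x):
    return math.tanh(x)

def softplus(x):
    # log(1 + exp(x)) written in an overflow-safe form.
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

def swish(x):
    return x * sigmoid(x)

def mish(x):
    return x * math.tanh(softplus(x))

def gelu(x):
    # Exact GELU using the Gaussian error function.
    return 0.5 * x * (1.0 + math.erf(x / math.sqrt(2.0)))

def softmax(xs):
    # Works on a whole vector; subtracting the max keeps exp() stable.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

probabilities = softmax([1.0, 2.0, 3.0])
```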

Optimizer- Gradient Descent (Batch Gradient Descent, Stochastic Gradient Descent, Mini-batch Gradient Descent), SGD with momentum, Adagrad, RMSProp, AMSGrad, Adam, AdaBelief, MADGRAD, Nero

https://analyticsindiamag.com/ultimate-guide-to-pytorch-optimizers/ https://analyticsindiamag.com/guide-to-tensorflow-keras-optimizers/
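
To make the difference between the optimizers listed above concrete, here is a minimal sketch of three update rules (plain gradient descent, SGD with momentum, Adam) minimizing a toy one-parameter loss f(w) = (w - 3)^2. The loss, the starting point and all hyperparameter values are assumptions chosen for the example; the Adam defaults shown are the commonly used ones.

```python
import math

def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)^2, whose minimum is at w = 3."""
    return 2.0 * (w - 3.0)

def gradient_descent(w, lr=0.1, steps=500):
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def sgd_momentum(w, lr=0.1, beta=0.9, steps=500):
    velocity = 0.0
    for _ in range(steps):
        # Velocity accumulates an exponentially decaying sum of past gradients.
        velocity = beta * velocity + grad(w)
        w -= lr * velocity
    return w

def adam(w, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g        # first moment (mean of gradients)
        v = beta2 * v + (1 - beta2) * g * g    # second moment (mean of squared gradients)
        m_hat = m / (1 - beta1 ** t)           # bias correction for the zero initialization
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

w_gd = gradient_descent(0.0)
w_momentum = sgd_momentum(0.0)
w_adam = adam(0.0)
```

All three end up close to w = 3 on this easy problem; the differences between them matter more on noisy or badly scaled losses.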

Regularization- L1, L2, elastic net, dropout, early stopping, data augmentation, batch normalization, layer normalization, group normalization, tree pruning, DropBlock, DropConnect, learning rate scheduling, weight decay, gradient clipping, adaptive optimizer

Addressing Overfitting - 13 Methods

  1. Dimensionality Reduction
  2. Feature Selection
  3. Early Stopping
  4. K-Fold Cross-Validation
  5. Creating Ensembles
  6. Pre‐Pruning
  7. Post‐Pruning
  8. Noise Regularization
  9. Dropout Regularization
  10. L1 and L2 Regularization
  11. Data (Image) Augmentation
  12. Adding More Training Data
  13. Reducing Network Width & Depth
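
Early stopping (method 3 above) is simple to sketch: watch the validation loss and stop once it has not improved for a set number of epochs. The helper below is a hypothetical illustration working on a pre-recorded list of losses; `patience` and `min_delta` are assumed names modeled on common usage, and in a real training loop the check runs once per epoch.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return (stop_epoch, best_epoch) for a sequence of validation losses.

    Training stops at the first epoch where the loss has failed to improve
    by more than min_delta for `patience` epochs in a row.
    """
    best_loss = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss - min_delta:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Patience never ran out: training used every epoch.
    return len(val_losses) - 1, best_epoch

# Validation loss improves until epoch 3, then starts to rise (overfitting).
val_losses = [0.90, 0.70, 0.55, 0.50, 0.52, 0.51, 0.53, 0.60]
stop_epoch, best_epoch = early_stopping_epoch(val_losses, patience=3)
```

In practice the model weights from `best_epoch` are restored after stopping.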

DropBlock: A New Regularization Technique https://pub.towardsai.net/dropblock-a-new-regularization-technique-e926bbc74adb

Learning rate scheduling (Learning rate finder),Weight Decay,Gradient clipping,Cyclic Learning Rate
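
The techniques on the line above can each be written in a few lines. The sketch below shows a step-decay schedule, a triangular cyclic learning rate, gradient clipping by global norm, and an SGD step with weight decay. All function names and default values are assumptions for illustration; frameworks such as PyTorch and Keras provide their own schedulers and clipping utilities.

```python
import math

def step_decay(base_lr, epoch, drop=0.5, every=10):
    """Learning rate scheduling: multiply the rate by `drop` every `every` epochs."""
    return base_lr * (drop ** (epoch // every))

def triangular_cyclic_lr(step, base_lr=0.001, max_lr=0.006, step_size=4):
    """Cyclic learning rate: rise linearly to max_lr over step_size steps, then fall back."""
    cycle = step // (2 * step_size)
    x = abs(step / step_size - 2 * cycle - 1)
    return base_lr + (max_lr - base_lr) * max(0.0, 1.0 - x)

def clip_by_norm(grads, max_norm):
    """Gradient clipping: rescale the gradient vector if its L2 norm exceeds max_norm."""
    norm = math.sqrt(sum(g * g for g in grads))
    if norm <= max_norm or norm == 0.0:
        return list(grads)
    scale = max_norm / norm
    return [g * scale for g in grads]

def sgd_step_with_weight_decay(w, g, lr=0.1, weight_decay=0.01):
    """Weight decay: shrink the weight toward zero in addition to the gradient step."""
    return w - lr * (g + weight_decay * w)

lrs = [triangular_cyclic_lr(s) for s in range(9)]   # one full cycle: up, then down
clipped = clip_by_norm([3.0, 4.0], max_norm=1.0)    # norm 5 is scaled down to norm 1
```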

Weight initialization: normal distribution, constant (all weights initialized to the same value), Xavier initialization, He normal initialization
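
Xavier and He initialization differ only in how they scale the random weights by layer size. The sketch below uses the standard formulas (Xavier uniform with limit sqrt(6 / (fan_in + fan_out)), He normal with standard deviation sqrt(2 / fan_in)); the function names, layer sizes and seed are assumptions for the example.

```python
import math
import random

def xavier_uniform(fan_in, fan_out, rng):
    """Xavier (Glorot) uniform: often paired with tanh or sigmoid layers."""
    limit = math.sqrt(6.0 / (fan_in + fan_out))
    return [[rng.uniform(-limit, limit) for _ in range(fan_out)] for _ in range(fan_in)]

def he_normal(fan_in, fan_out, rng):
    """He normal: variance 2 / fan_in, often paired with ReLU layers."""
    std = math.sqrt(2.0 / fan_in)
    return [[rng.gauss(0.0, std) for _ in range(fan_out)] for _ in range(fan_in)]

def sample_std(matrix):
    """Standard deviation of all entries, used here to check the scaling."""
    flat = [w for row in matrix for w in row]
    mean = sum(flat) / len(flat)
    return math.sqrt(sum((w - mean) ** 2 for w in flat) / len(flat))

rng = random.Random(0)  # fixed seed so the example is repeatable
w_xavier = xavier_uniform(256, 128, rng)
w_he = he_normal(256, 256, rng)
he_std = sample_std(w_he)
```

Initializing every weight to the same value is listed above mostly as a warning: all units in a layer then receive identical gradients and never learn different features.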

Different Normalization Layers - https://towardsdatascience.com/different-normalization-layers-in-deep-learning-1a7214ff71d6

Hyperparameters: number of hidden layers, dropout rate, activation function, weight initialization, learning rate, epochs, iterations, and batch size

DropBlock-Keras-Implementation https://github.com/iantimmis/DropBlock-Keras-Implementation https://github.com/miguelvr/dropblock https://github.com/DHZS/tf-dropblock

standard dropout,early dropout,late dropout
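
A minimal sketch of dropout and of the three schedules named above. It assumes that "early dropout" means dropout is active only before a switch epoch and "late dropout" means it is active only after it; that reading, the function names, and the `switch_epoch` parameter are assumptions for illustration. The dropout itself is the usual inverted form, which rescales the kept values so the expected activation stays the same.

```python
import random

def dropout(values, p, rng, training=True):
    """Inverted dropout: zero each value with probability p (p < 1),
    and scale the survivors by 1 / (1 - p). Does nothing at inference time."""
    if not training or p == 0.0:
        return list(values)
    keep = 1.0 - p
    return [v / keep if rng.random() < keep else 0.0 for v in values]

def scheduled_rate(epoch, p, switch_epoch, mode="standard"):
    """Dropout rate to use at a given epoch under each schedule."""
    if mode == "early":
        return p if epoch < switch_epoch else 0.0   # on at first, then switched off
    if mode == "late":
        return 0.0 if epoch < switch_epoch else p   # off at first, then switched on
    return p                                        # standard: always on

rng = random.Random(0)  # fixed seed so the example is repeatable
activations = [1.0] * 10000
dropped = dropout(activations, p=0.5, rng=rng)
mean_after = sum(dropped) / len(dropped)            # stays close to the original mean
rates_early = [scheduled_rate(e, 0.1, 5, "early") for e in range(10)]
```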

Hyperparameter tuning

  1. https://analyticsindiamag.com/top-8-approaches-for-tuning-hyperparameters-of-machine-learning-models/ https://analyticsindiamag.com/top-10-open-source-hyperparameter-optimisation-libraries-for-ml-models/
  2. https://github.com/balavenkatesh3322/hyperparameter_tuning
  3. A.manual search
  4. a.GridSearchCV (checks every given parameter combination, so it takes a long time), TuneGridSearchCV
  5. HalvingGridSearch https://towardsdatascience.com/11-times-faster-hyperparameter-tuning-with-halvinggridsearch-232ed0160155 https://towardsdatascience.com/faster-hyperparameter-tuning-with-scikit-learn-71aa76d06f12
  6. tune-sklearn https://github.com/ray-project/tune-sklearn (TuneGridSearchCV)
  7. b.RandomizedSearchCV (samples parameter combinations at random, which cuts down search time) with Scikit-learn, Scikit-Optimize, Hyperopt, TuneSearchCV
  8. HalvingRandomSearchCV
  9. c.Libraries and platforms: Optuna, Hyperopt, HyperOpt-Sklearn, Scikit-Optimize, Keras Tuner, Ray Tune, Microsoft NNI, Google Vizier, DEAP, OptFormer, hgboost, Sklearn-genetic, GPyOpt, pyGPGO, Mango, mlmachine, Polyaxon, BayesianOptimization, Talos, SHERPA, SMAC, SigOpt. Methods: exhaustive search, heuristic search, Bayes search, Bayesian Optimization with Gaussian Processes (BO-GP), Bayesian Optimization with Tree-structured Parzen Estimator (BO-TPE), Multivariate TPE, Hyperband and BOHB, multi-fidelity optimization, Particle swarm optimization (PSO), Genetic algorithm (GA), Simulated annealing (SA), Population-Based Training (PBT)
  10. Bayesian Optimization: https://github.com/fmfn/BayesianOptimization
  11. Scikit Optimize: https://github.com/scikit-optimize/scikit-optimize
  12. Pyro: https://github.com/pyro-ppl/pyro
  13. BoTorch: https://github.com/pytorch/botorch
  14. RBFOpt library for black-box optimization https://github.com/coin-or/rbfopt
  15. Bayesian search with Gaussian processes,bayesian search with Random Forests,Bayesian search with GBMs
  16. Bayesian Optimization Using BoTorch https://analyticsindiamag.com/guide-to-bayesian-optimization-using-botorch/
  17. hyperparameter optimization https://github.com/LiYangHart/Hyperparameter-Optimization-of-Machine-Learning-Algorithms
  18. Hyperopt hyperas https://www.kdnuggets.com/2018/12/keras-hyperparameter-tuning-google-colab-hyperas.html
  19. hyperopt http://hyperopt.github.io/hyperopt/
  20. hypertune-using-scikit-optimize BayesSearchCV
  21. HpBandSter https://github.com/automl/HpBandSter hpsklearn https://medium.com/mlearning-ai/automatic-hyperparameter-optimization-6a1692c2ebee
  22. hypopt https://github.com/cgnorthcutt/hypopt https://medium.com/mlearning-ai/automatic-hyperparameter-optimization-6a1692c2ebee
  23. HiPlot https://analyticsindiamag.com/this-new-tool-helps-developers-in-effective-hyperparameter-tuning/
  24. botorch Bayesian optimization https://github.com/pytorch/botorch
  25. OCTIS https://github.com/mind-lab/octis
  26. hyperband https://neptune.ai/blog/hyperband-and-bohb-understanding-state-of-the-art-hyperparameter-optimization-algorithms
  27. Spearmint https://github.com/JasperSnoek/spearmint/
  28. tuun Hyperparameter tuning via uncertainty modeling https://github.com/petuum/tuun
  29. tune-sklearn https://github.com/ray-project/tune-sklearn/
  30. NeuPy http://neupy.com/2016/12/17/hyperparameter_optimization_for_neural_networks.html#id24
  31. Vizier
  32. ConfigSpace https://automl.github.io/ConfigSpace/master/ https://towardsdatascience.com/tuning-xgboost-with-xgboost-writing-your-own-hyper-parameters-optimization-engine-a593498b5fba
  33. NatureInspiredSearchCV https://github.com/timzatko/Sklearn-Nature-Inspired-Algorithms
  34. d.Sequential Model Based Optimization(Tuning a scikit-learn estimator with skopt)
  35. e.Optuna https://analyticsindiamag.com/hands-on-python-guide-to-optuna-a-new-hyperparameter-optimization-tool/
  36. f.Genetic Algorithms,Gradient-based optimization
  37. darwin-mendel Genetic Algorithm for Hyper-Parameter Tuning https://manishagrawal-datascience.medium.com/genetic-algorithm-for-hyper-parameter-tuning-1ca29b201c08
  38. g.Keras tuner (Random Search Keras Tuner,HyperBand Keras Tuner,Bayesian Optimization Keras Tuner,Hyperas ) https://sukanyabag.medium.com/automated-hyperparameter-tuning-with-keras-tuner-and-tensorflow-2-0-31ec83f08a62
  39. Keras Hyperparameter Tuning with aisaratuners Library https://aisaradeepwadi.medium.com/advance-keras-hyperparameter-tuning-with-aisaratuners-library-78c488ab4d6a
  40. hyperas Automating Hyperparameter Tuning of Keras Model https://github.com/maxpumperla/hyperas
  41. storm-tuner https://github.com/ben-arnao/StoRM https://medium.com/geekculture/finding-best-hyper-parameters-for-deep-learning-model-4df7a17546c2
  42. Hyperas https://towardsdatascience.com/automating-hyperparameter-tuning-of-keras-model-4fe69b8dedee
  43. hyperopt-sklearn https://github.com/hyperopt/hyperopt-sklearn
  44. Deep AutoViML https://github.com/AutoViML/deep_autoviml
  45. h.Scikit-Optimize,Optuna,Hyperopt,Multi-fidelity Optimization,Gradient-based optimization,Evolutionary optimization,Population-based,Bayes Search
  46. Scikit-Optimize library comes with BayesSearchCV implementation
  47. mle-hyperopt Lightweight Hyperparameter Optimization Tool https://github.com/mle-infrastructure/mle-hyperopt
  48. h.Hyperparameter Optimization https://github.com/awslabs/syne-tune
  49. i.ray[tune] and aisaratuners https://towardsdatascience.com/choosing-a-hyperparameter-tuning-library-ray-tune-or-aisaratuners-b707b175c1d7
  50. raytune https://docs.ray.io/en/master/tune/index.html https://docs.ray.io/en/latest/tune/index.html
  51. k.model_search https://github.com/google/model_search https://analyticsindiamag.com/hands-on-guide-to-model-search-a-tensorflow-based-framework-for-automl/
  52. Optimize machine learning models https://www.tensorflow.org/model_optimization
  53. Milano https://github.com/NVIDIA/Milano
  54. Tree-structured Parzen Estimators - TPE , TPE with Hyperopt
  55. Hyperparameter Tuning with the HParams Dashboard
  56. baytune https://www.kdnuggets.com/2021/03/automating-machine-learning-model-optimization.html
  57. Dragonfly https://analyticsindiamag.com/guide-to-scalable-and-robust-bayesian-optimization-with-dragonfly/
  58. Pywedge https://www.analyticsvidhya.com/blog/2021/02/interactive-widget-based-hyperparameter-tuning-and-tracking-in-pywedge/
  59. CapsNet Hyperparameter Tuning with Keras https://towardsdatascience.com/scikeras-tutorial-a-multi-input-multi-output-wrapper-for-capsnet-hyperparameter-tuning-with-keras-3127690f7f28
  60. GPyTorch: A Python Library For Gaussian Process Models https://analyticsindiamag.com/guide-to-gpytorch-a-python-library-for-gaussian-process-models/
  61. Auto-PyTorch https://github.com/automl/Auto-PyTorch
  62. l.SMAC https://www.automl.org/automated-algorithm-design/algorithm-configuration/smac/ https://towardsdatascience.com/automl-for-fast-hyperparameters-tuning-with-smac-4d70b1399ce6
  63. m.faster Hyper Parameter Tuning(sklearn-nature-inspired-algorithms) https://pypi.org/project/sklearn-nature-inspired-algorithms/
  64. n.talos Neural network and hyperparameter optimization using Talos https://www.analyticsvidhya.com/blog/2021/05/neural-network-and-hyperparameter-optimization-using-talos/
  65. https://towardsdatascience.com/10-hyperparameter-optimization-frameworks-8bc87bc8b7e3
  66. https://mlwhiz.com/blog/2020/02/22/hyperspark/?utm_campaign=100x-faster-hyperparameter-search-framework-with-pyspark&utm_medium=social_link&utm_source=missinglettr
  67. DeepSpeed is a deep learning optimization library that makes distributed training easy, efficient, and effective https://github.com/microsoft/DeepSpeed
  68. o.shap-hypetune https://github.com/cerlymarco/shap-hypetune https://towardsdatascience.com/shap-for-feature-selection-and-hyperparameter-tuning-a330ec0ea104
  69. mlmachine,Polyaxon,BayesianOptimization,Talos,SHERPA,Scikit-Optimize,GPyOpt
  70. p.Hyperactive https://github.com/SimonBlanke/Hyperactive
  71. Hyperopt, Optuna, and Ray,SCIKIT-OPTIMIZE,SMAC,Multi-fidelity Optimization,Successive Halving,Hyperband BOHB,SMBOSearch
  72. OMLT optimization https://github.com/cog-imperial/OMLT
  73. HyperOpt http://hyperopt.github.io/hyperopt/ Optuna https://optuna.org/ Scikit-optimize https://scikit-optimize.github.io/stable/ SigOpt https://sigopt.com/
  74. DeepHyper Hyperparameter Search for Deep Neural Networks https://github.com/deephyper/deephyper
  75. lipo hyperparameter tuning https://github.com/jdb78/lipo
  76. Weights and Biases to Perform Hyperparameter Optimization https://hackernoon.com/using-weights-and-biases-to-perform-hyperparameter-optimization
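The difference between grid search (every combination) and random search (a fixed number of sampled combinations) can be sketched in plain Python. This is only an illustration, not the scikit-learn API: `validation_score` is a hypothetical stand-in for "train a model and return its validation score", and the search space values are made up.

```python
import itertools
import random

def validation_score(params):
    """Hypothetical stand-in for training a model and scoring it on validation data."""
    lr, depth = params["learning_rate"], params["max_depth"]
    # Toy objective that peaks at learning_rate=0.1, max_depth=6.
    return -((lr - 0.1) ** 2) * 100 - ((depth - 6) ** 2) * 0.01

def grid_search(space):
    """Evaluate every combination in the grid; cost grows multiplicatively."""
    names = list(space)
    candidates = [dict(zip(names, combo))
                  for combo in itertools.product(*space.values())]
    return max(candidates, key=validation_score), len(candidates)

def random_search(space, n_iter, seed=0):
    """Evaluate only n_iter randomly sampled combinations."""
    rng = random.Random(seed)
    candidates = [{name: rng.choice(values) for name, values in space.items()}
                  for _ in range(n_iter)]
    return max(candidates, key=validation_score), len(candidates)

# Assumed search space, for illustration only.
space = {
    "learning_rate": [0.001, 0.01, 0.1, 0.3],
    "max_depth": [2, 4, 6, 8, 10],
}

grid_best, grid_evals = grid_search(space)
rand_best, rand_evals = random_search(space, n_iter=8)
print("grid:", grid_best, "evaluations:", grid_evals)
print("random:", rand_best, "evaluations:", rand_evals)
```

Grid search always finds the best point in the grid but pays for all 20 evaluations here; random search pays for 8 and may or may not hit the best point. The halving and Bayesian methods in the list try to spend that evaluation budget more cleverly.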

Cross validation techniques- https://towardsdatascience.com/understanding-8-types-of-cross-validation-80c935a4976d

a.Exhaustive cross-validation methods train and test on every possible way of dividing the dataset into training and testing subsets.
b.Non-exhaustive cross-validation methods do not compute all ways of splitting the sample.

  1. 1.LOOCV
  2. 2.K-fold CV,Repeated K-Folds Method,Shuffle & Split cross-validation
  3. 3.Stratified cross validation,Stratified K-fold CV,Group K-fold,StratifiedGroupKFold,StratifiedShuffleSplit,Nested K-folds,Random split KFold,Walk forward,Group Time Series,Purged Group KFold,Combinatorial Purged Group KFold
  4. 4.Repeated K-folds,RepeatedStratifiedKFold,Repeated random subsampling CV
  5. 5.Holdout cross-validation
  6. 6.Repeated cross-validation,Repeated K-folds,Blocked Cross-Validation Method, Nested Cross-Validation Method,Group Cross-Validation,GroupShuffleSplit,Blocked Cross-Validation
  7. 7.LeaveOneOut,Leave P out ,Leave-one-out cross-validation,Leave-One-Group-Out Method,Leave-P-Group-Out Method
  8. 8.Time Series cross-validation,Time Series Split cross-validation ,Rolling Cross-Validation,Rolling Time Series Cross Validation,Rolling Window Cross-Validation,Monte Carlo Cross-Validation,Holdout Time Series Cross-Validation,Time Series Cross-Validation with a Gap,Sliding Time Series Cross-Validation,GapKFold,GapLeavePOut,GapRollForward
  9. 9.ShuffleSplit Cross Validation,Group Shuffle Split,Simple Time Split Validation,Sliding Window Validation,Expanding Window Validation
  10. 10.Group KFold Cross Validation
  11. 11.Monte-Carlo Cross Validation,Blocked cross-validation,Blocked K-Fold Cross-Validation,Modified K-Fold Cross-Validation
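A few of the splitting schemes above can be sketched as index generators in plain Python. This is a simplified illustration (no shuffling, no stratification, no groups), not the scikit-learn implementation; the function names are hypothetical.

```python
def k_fold_indices(n_samples, k):
    """Yield (train, test) index lists for k contiguous folds."""
    # Spread the remainder over the first folds so sizes differ by at most 1.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0)
                  for i in range(k)]
    start = 0
    for size in fold_sizes:
        stop = start + size
        test = list(range(start, stop))
        train = [i for i in range(n_samples) if i < start or i >= stop]
        yield train, test
        start = stop

def leave_one_out_indices(n_samples):
    """Leave-one-out is k-fold with k equal to the number of samples."""
    return k_fold_indices(n_samples, n_samples)

def expanding_window_indices(n_samples, n_splits, test_size):
    """Time-series style splits: train on the past, test on the next block."""
    first_test_start = n_samples - n_splits * test_size
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested splits")
    for i in range(n_splits):
        test_start = first_test_start + i * test_size
        train = list(range(test_start))
        test = list(range(test_start, test_start + test_size))
        yield train, test

for train, test in k_fold_indices(10, 3):
    print("k-fold test:", test)
for train, test in expanding_window_indices(10, 3, 2):
    print("time series train up to", train[-1], "test:", test)
```

The time-series variant never lets a training index come after a test index, which is the property the rolling and walk-forward schemes in the list are designed to keep.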

TensorBoard, Neptune, and TensorFlow Profiler to visualize model performance

Distributed Training with TensorFlow

6.Testing model

Text Robustness Evaluation Platform https://github.com/textflint/textflint

Generally used metrics

  1. Always check the bias-variance tradeoff to know how the model is performing
  2. Locust Performance Testing ML Serving APIs With Locust https://www.analyticsvidhya.com/blog/2021/06/performance-testing-ml-serving-apis-with-locust/
  3. Model can be overfitting (low bias, high variance), underfitting (high bias, low variance), or a good fit (low bias, low variance)
  4. https://scikit-learn.org/stable/modules/model_evaluation.html https://scikit-learn.org/stable/modules/classes.html#module-sklearn.linear_model
  5. https://stanford.edu/~shervine/teaching/cs-229/cheatsheet-machine-learning-tips-and-tricks
  6. KS test to evaluate the separation between class distributions
  7. Evaluating the potential return of a model with Lift, Gain, and Decile Analysis
  8. 1.Regression task - mean squared error, root mean squared error, mean absolute error, R², Adjusted R², mean percentage error
  9. 2.Classification task-Accuracy,confusion matrix,Precision,Recall,F1 Score,Binary Crossentropy,Categorical Crossentropy,AUC-ROC curve,AUPRC,log loss,Average precision,Mean average precision
  10. 3.Reinforcement learning - generally use rewards
  11. 4.In case of machine translation, use BLEU score
  12. 5.Clustering - External (compared against ground-truth labels): Adjusted Rand index, Jaccard Score, Purity Score,Rand Index,Mutual Information,V-measure,Fowlkes-Mallows Scores
  13. Internal (no labels needed): silhouette_score, Davies-Bouldin Index, Dunn Index,DBCV
  14. autoelbow,elbow,Davies-Bouldin Index,Calinski-Harabasz Index
  15. https://towardsdatascience.com/performance-metrics-in-machine-learning-part-3-clustering-d69550662dc6
  16. 6.Object Detection loss-localization loss,classification loss,Focal Loss,IOU,L2 loss
  17. 7.Distance Metrics - Euclidean Distance,Manhattan Distance,Minkowski Distance,Hamming Distance https://towardsdatascience.com/9-distance-measures-in-data-science-918109d069fa
  18. Dimensionality Reduction Metrics - Cumulative Explained Variance,Trustworthiness,Sammons Mapping
  19. 8.Recommender Systems https://parthchokhra.medium.com/evaluating-recommender-systems-590a7b87afa5
  20. Accuracy Metrics (RMSE, MAE),Top-N Hit Rate
  21. RecList: The better way to evaluate recommender systems
  22. Similarity metrics : Cosine similarity,Jaccard similarity,Euclidean distance Predictive metrics: MAE,RMSE
  23. metric-Built-in metrics, Custom metric without external parameters,Custom metric with external parameters,Subclassing custom metric layer
  24. Robustness Gym: Evaluation Toolkit for NLP https://github.com/robustness-gym/robustness-gym
  25. https://medium.com/swlh/custom-loss-and-custom-metrics-using-keras-sequential-model-api-d5bcd3a4ff28
  26. loss-Built-in loss, Custom loss without external parameters,Custom loss with external parameters,Subclassing loss layer
  27. https://analyticsindiamag.com/all-pytorch-loss-function/ https://analyticsindiamag.com/ultimate-guide-to-loss-functions-in-tensorflow-keras-api-with-python-implementation/
  28. tensorwatch Debugging, monitoring and visualization for Python Machine Learning and Data Science https://github.com/microsoft/tensorwatch
  29. Types of Data Drift : Concept drift,Virtual drift,Covariate shift,Prior probability shift,Annotator drift,Data poisoning
  30. mitigate the effects of data drift: Regular retraining,Data preprocessing,Data augmentation,Monitoring,Online learning,Domain adaptation,Annotator and data quality control
  31. Methods to Detect Drift: A) Statistical approaches: Page-Hinkley method,Kolmogorov-Smirnov Test,Population Stability Index (PSI),Kullback-Leibler (KL) divergence,Jensen-Shannon divergence,Wasserstein Distance B) Model-based approach C) Adaptive sliding window D) Data visualization tools E) Model performance monitoring F) Drift detection libraries
  32. Tools to detect model drift: whylogs,Evidently,Alibi Detect
  33. Steps to take when drift occurs: check data quality, investigate, retrain the model, rebuild the model, pause the model and fall back
  34. Ways to handle Drift in Production a) Rapidly adapt to concept drift b) Be resistant to noise while distinguishing it from concept drift c) Notice and handle severe drift in model performance.
  35. article link https://medium.com/@dummahajan/combating-data-drift-the-fight-for-model-accuracy-2c619ee1e33a
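The most common classification and regression metrics from the list above follow directly from their definitions. A minimal sketch in plain Python, assuming binary labels where `1` is the positive class (function names and sample data are illustrative only):

```python
import math

def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary task."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn

def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 from the confusion matrix."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / len(y_true),
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

def regression_metrics(y_true, y_pred):
    """MSE, RMSE and MAE for a regression task."""
    errors = [pred - truth for truth, pred in zip(y_true, y_pred)]
    mse = sum(e * e for e in errors) / len(errors)
    mae = sum(abs(e) for e in errors) / len(errors)
    return {"mse": mse, "rmse": math.sqrt(mse), "mae": mae}

# Made-up labels and predictions.
clf = classification_metrics([1, 0, 1, 1, 0, 0, 1, 0], [1, 0, 0, 1, 0, 1, 1, 0])
reg = regression_metrics([3.0, 5.0, 2.0], [2.0, 5.0, 4.0])
print(clf)
print(reg)
```

In practice the scikit-learn `sklearn.metrics` module linked above provides tested versions of these; the sketch only shows what the numbers mean.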

Docker and Kubernetes

https://towardsdatascience.com/deploy-machine-learning-app-built-using-streamlit-and-pycaret-on-google-kubernetes-engine-fd7e393d99cb

simplest way to serve your ML models on Kubernetes https://towardsdatascience.com/the-simplest-way-to-serve-your-ml-models-on-kubernetes-5323a380bf9f

7.deployment https://github.com/piyushpathak03/Model-Deployment

  1. Train: one off, batch and real-time/online training
  2. Serve: Batch, Realtime (Database Trigger, Pub/Sub, web-service, inApp)
  3. Continuously Monitor the Behaviour of Deployed Models https://se-ml.github.io/best_practices/04-monitor_models_prod/
  4. Model Monitoring https://www.kdnuggets.com/2021/03/machine-learning-model-monitoring-checklist.html
  5. Automate Model Deployment https://se-ml.github.io/best_practices/04-auto_model_packaging/
  6. Platform as a Service (PaaS),Infrastructure as a Service (IaaS),SaaS (Software as a Service)
  7. 3 main approaches to saving and reloading an ML model: Pickle approach, Joblib approach, JSON approach
  8. https://www.datacamp.com/community/tutorials/pickle-python-tutorial https://github.com/balavenkatesh3322/model_deployment
  9. 1.Azure
  10. 2.Heroku
  11. 3.Amazon Web Services Everything AWS https://app.polymersearch.com/discover/aws
  12. 4.Google cloud platform
  13. 5.ngrok https://www.youtube.com/watch?v=AkEnjJ5yWV0
  14. Deploy a Machine Learning Model for Free https://www.freecodecamp.org/news/deploy-your-machine-learning-models-for-free/
  15. mlpack is a fast, flexible machine learning library suitable for both data science prototyping and deployment https://numfocus.org/project/mlpack https://github.com/mlpack/mlpack
  16. MODEL DEPLOYMENT USING TF SERVING
  17. Dockerize https://www.kdnuggets.com/2021/04/dockerize-any-machine-learning-application.html https://pub.towardsai.net/how-to-dockerize-your-data-science-project-a-quick-guide-b6fa2d6a8ba1
  18. bodywork-core MLOps tool for deploying machine learning projects to Kubernetes https://github.com/bodywork-ml/bodywork-core
  19. Create ML model inside the docker container https://dev.to/niteshthapliyal/create-ml-model-inside-the-docker-container-3b23
  20. LyftLearn: ML Model Training Infrastructure built on Kubernetes https://eng.lyft.com/lyftlearn-ml-model-training-infrastructure-built-on-kubernetes-aef8218842bb
  21. Model Serving https://neptune.ai/blog/ml-model-serving-best-tools?utm_source=twitter&utm_medium=tweet&utm_campaign=blog-ml-model-serving-best-tools
  22. TensorFlow Extended (TFX) is an end-to-end platform for deploying production ML pipelines https://www.tensorflow.org/tfx https://theaisummer.com/tfx/?utm_content=163294295&utm_medium=social&utm_source=linkedin&hss_channel=lcp-42461735
  23. torchblaze https://github.com/MLH-Fellowship/torchblaze https://mlh-fellowship.github.io/torchblaze/
  24. ML Aide Manage Machine Learning Lifecycle https://mlaide.com/home https://medium.com/ml-aide/manage-machine-learning-lifecycle-with-ml-aide-dfe7710cbe53
  25. Models visualization using Tensorboard,netron, TensorBoard.dev
  26. Python web Frameworks for App Development- Flask,Streamlit,fastapi,fastDeploy,Django,Web2py,Pyramid,CherryPy,Voila,Kivy and Kivymd
  27. streamlit,Gradio,mia,opyrator,plotly jupyterdash,h2o wave,dash,PyWebIO,r shiny,sanic,panel,flask,django,PySimpleGUI,autocalc,Mercury,Chitra,Bokeh,jupyter Voila with ipywidgets,Fast Dash,BentoML,Cortex,Seldon Core,UnionML,Taipy,fastDeploy,MLflow,TensorFlow Serving,KServe,KFServing,TorchServe,ray,clearml,mlrun,pymlpipe,Cog,truss,playtorch,Streamsync,Databutton,plotly,pyscript,skops,Mage,sematic,bentoctl,Banana,Pyramid,Docker,Kubernetes,SageMaker,Kubeflow,Multi Model Server,Triton Inference Server,ForestFlow,BudgetML,GraphPipe,Hydrosphere,MLEM,Apache PredictionIO,DeepDetect,datapane,Pynecone,Anvil,h2oai nitro,rest-model-service,CherryPy,modelbit,wagtail,flet,Chainlit,Solara
  28. Django models https://www.deploymachinelearning.com/#create-django-models https://www.deploymachinelearning.com/
  29. BentoML https://github.com/bentoml/BentoML
  30. UnionML: the easiest way to build and deploy machine learning microservices https://github.com/unionai-oss/unionml
  31. panel high-level app and dashboarding solution for Python https://github.com/holoviz/panel
  32. sanic https://github.com/sanic-org/sanic
  33. Gradio - take input from user https://gradio.app/getting_started
  34. Fast Dash https://fastdash.app/
  35. binder - https://mybinder.org/
  36. Netlify https://www.analyticsvidhya.com/blog/2021/04/easily-deploy-your-machine-learning-model-into-a-web-app-netlify/
  37. streamlit https://www.kdnuggets.com/2019/10/write-web-apps-using-simple-python-data-scientists.html https://www.youtube.com/watch?v=iUgNIFrVejc https://blog.streamlit.io/introducing-theming/
  38. Streamlit Flask App from Colab using remoteit and ngrok https://www.youtube.com/watch?v=O2enoygZwl4
  39. Streamlit to databases https://docs.streamlit.io/en/0.83.0/tutorial/databases.html
  40. https://github.com/jrieke/best-of-streamlit
  41. https://neptune.ai/blog/streamlit-guide-machine-learning?utm_source=twitter&utm_medium=tweet&utm_campaign=blog-streamlit-guide-machine-learning
  42. streamlit-ace https://github.com/okld/streamlit-ace https://www.youtube.com/watch?v=Iv2vt-7AYNQ
  43. customize the themes of your Streamlit web apps https://www.youtube.com/watch?v=3xJYP_C4KNE https://github.com/khuyentran1401/Data-science/tree/master/applications/pywebio_examples
  44. colab_everything Python library to run streamlit, flask, fastapi, etc on google colab https://github.com/Ankur-singh/colab_everything/
  45. dash https://github.com/plotly/dash
  46. panel-highcharts https://awesome-panel.org/ https://github.com/marcskovmadsen/panel-highcharts https://github.com/holoviz/panel
  47. opyrator Turns your machine learning code into microservices with web API, interactive GUI, and more https://github.com/ml-tooling/opyrator
  48. plotly https://plotly.com/ https://analyticsindiamag.com/how-to-use-plotly-in-colab/
  49. Creating a Machine Learning App with Power BI and PyCaret
  50. Streamlit vs. Dash vs. Shiny vs. Voila vs. Flask vs. Jupyter vs django vs PySimpleGUI vs pywebio vs Gradio vs autocalc vs Mercury vs Chitra https://www.datarevenue.com/en-blog/data-dashboarding-streamlit-vs-dash-vs-shiny-vs-voila Also: pymlpipe,Lightning Apps,Aibro
  51. Mercury: easily convert Python notebook to web app and share with others https://github.com/mljar/mercury
  52. autocalc https://github.com/kefirbandi/autocalc https://towardsdatascience.com/creating-a-ui-with-ipywidgets-and-autocalc-2ef8ea4cc6c2
  53. Quickly deploy ML WebApps https://ngrok.com/
  54. Chitra https://github.com/gradsflow/chitra
  55. Deepnote https://deepnote.com/ https://www.youtube.com/watch?v=0ppptVxgEI8
  56. booklet https://booklet.ai/ https://towardsdatascience.com/building-a-covid-19-project-recommendation-system-4607806923b9
  57. https://analyticsindiamag.com/top-8-python-tools-for-app-development/
  58. Voila This library can turn your Jupyter notebooks into standalone web apps that can be deployed to any cloud platform. https://voila.readthedocs.io/en/stable/
  59. H2O.ai https://www.h2o.ai/blog/data-to-production-ready-models-to-business-apps-in-just-a-few-steps/
  60. PyQt, Tkinter, and PySimpleGUI for GUI programming in Python https://github.com/tirthajyoti/DS-with-PySimpleGUI
  61. DearPyGui https://github.com/hoffstadt/DearPyGui
  62. PySimpleGUI https://github.com/PySimpleGUI/PySimpleGUI
  63. Gooey Turn (almost) any Python command line program into a full GUI application with one line https://github.com/chriskiehl/Gooey
  64. snapyml Deploy AI Models For Free -http://snapyml.snapy.ai/
  65. BentoML https://github.com/bentoml/BentoML
  66. H2O wave-apps https://github.com/h2oai/wave-apps https://h2oai.github.io/wave/docs/installation/ https://h2oai.github.io/wave/
  67. H2O Wave ML (AutoML for Wave Apps) https://h2oai.github.io/wave/blog/ml-release-0.3.0/
  68. fastapi https://towardsdatascience.com/deploying-ml-models-in-production-with-fastapi-and-celery-7063e539a5db
  69. FastAPI + Uvicorn https://www.kdnuggets.com/2021/04/deploy-machine-learning-models-to-web.html
  70. FastAPI based template https://github.com/99sbr/fastapi-template fastapi-log 0.0.3 https://pypi.org/project/fastapi-log/
  71. testing FastAPI ML APIs with Locust https://locust.io/ https://rubikscode.net/2022/03/21/performance-testing-fastapi-ml-apis-with-locust/
  72. chitra Create API for Any Learning Model https://github.com/aniketmaurya/chitra
  73. PyWebIO Write Interactive Web App in Script Way Using Python https://www.youtube.com/watch?v=vp1ZNapAy6Y https://towardsdatascience.com/pywebio-write-interactive-web-app-in-script-way-using-python-14f50155af4e https://github.com/tirthajyoti/PyWebIO
  74. aibro Deploy Machine Learning Models to the Cloud Quickly and Easily https://aipaca.ai/?ref=hackernoon.com https://medium.datadriveninvestor.com/how-to-deploy-machine-learning-models-to-the-cloud-quickly-and-easily-41cca9425c75
  75. Katana https://github.com/shaz13/katana https://katana-demo.herokuapp.com/redoc https://katana-demo.herokuapp.com/docs
  76. DS-with-PySimpleGUI https://github.com/tirthajyoti/DS-with-PySimpleGUI
  77. pywinauto Windows GUI Automation with Python
  78. tkinter to deploy machine learning model-https://analyticsindiamag.com/complete-tutorial-on-tkinter-to-deploy-machine-learning-model/
  79. Tkinter-Designer Create Beautiful Tkinter GUIs by Drag and Drop https://github.com/ParthJadhav/Tkinter-Designer
  80. Web-Based GUI (Gradio)- https://analyticsindiamag.com/guide-to-gradio-create-web-based-gui-applications-for-machine-learning/ https://www.gradio.app/
  81. Bamboolib https://medium.com/ai-in-plain-english/bamboolib-a-data-warriors-weapon-9f734f4c2553
  82. web application(dash)- https://dash.plotly.com/
  83. Pyramid web framework https://trypyramid.com/documentation.html
  84. Kivy/KivyMD for creating an Android app
  85. https://towardsdatascience.com/pycaret-2-1-is-here-whats-new-4aae6a7f636a
  86. Create a Website with AI https://www.bookmark.com/
  87. localhost to globalurl https://ngrok.com/ https://remote.it/
  88. Jupyter Notebook into an interactive dashboard (voila)-https://voila.readthedocs.io/en/stable/
  89. high-level app and dashboarding solution(Panel)-https://panel.holoviz.org/
  90. MaaS Build ML Models As A Service https://www.analyticsvidhya.com/blog/2021/05/maas-build-ml-models-as-a-service/
  91. https://github.com/gradio-app/gradio
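The pickle and JSON approaches to saving and reloading a model, listed above, can be sketched with the standard library only. The "model" here is a hypothetical dictionary of linear-model parameters, not a real trained estimator; joblib is a third-party package with a similar dump/load workflow and is not shown.

```python
import json
import pickle
import tempfile
from pathlib import Path

def predict(model, row):
    """Linear prediction from a dict of parameters (illustrative model format)."""
    return sum(w * x for w, x in zip(model["weights"], row)) + model["bias"]

# Hypothetical trained parameters; the tuple makes the format difference visible.
model = {"weights": (0.5, -1.0), "bias": 2.0}

with tempfile.TemporaryDirectory() as tmp:
    # Pickle approach: stores arbitrary Python objects as bytes.
    pkl_path = Path(tmp) / "model.pkl"
    with open(pkl_path, "wb") as f:
        pickle.dump(model, f)
    with open(pkl_path, "rb") as f:
        restored_pickle = pickle.load(f)

    # JSON approach: stores only plain data, readable from other languages.
    json_path = Path(tmp) / "model.json"
    json_path.write_text(json.dumps(model))
    restored_json = json.loads(json_path.read_text())

row = [4.0, 1.0]
print(predict(restored_pickle, row), predict(restored_json, row))
print(type(restored_pickle["weights"]), type(restored_json["weights"]))
```

Pickle round-trips the object exactly (the tuple stays a tuple), while JSON keeps only the values (the tuple comes back as a list). Only unpickle files from sources you trust, since loading a pickle can execute arbitrary code.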

TensorFlow Lite: use TensorFlow Lite to reduce the size of a model https://www.tensorflow.org/lite https://codelabs.developers.google.com/codelabs/recognize-flowers-with-tensorflow-on-android-beta/#0 https://tfhub.dev/s?deployment-format=lite https://www.tensorflow.org/lite/examples https://www.tensorflow.org/lite/microcontrollers https://www.tensorflow.org/lite/models

Adventures-in-TensorFlow-Lite https://github.com/sayakpaul/Adventures-in-TensorFlow-Lite

coral https://coral.ai/docs/edgetpu/models-intro/

TF Micro and SensiML https://blog.tensorflow.org/2021/05/building-tinyml-application-with-tf-micro-and-sensiml.html

Types of model compression methods:
1) Pruning, Weight sharing
Structured pruning, unstructured pruning, local pruning, global pruning
Pruning criteria (weight magnitude criterion, gradient magnitude pruning, global or local pruning)
Model Pruning: Remove irrelevant edges and nodes from a network. Three popular types of pruning:
Zero pruning
Activation pruning
Redundancy pruning

3) Quantization, Int8 quantization
Post-Training Quantization
— Reduce Float16
— Hybrid Quantization
— Integer Quantization
-dynamic range quantization

  • Dynamic/Runtime Quantization
  • Post-Training Static Quantization
  • Static Quantization-aware Training (QAT)
  1. During-Training Quantization
  2. Post-Training Pruning
  3. Post-Training Clustering

4) Knowledge distillation
5) Parameter sharing
6) Tensor decomposition
7) Linear Transformer,Winograd Transformation
8) Selective attention
9) Low-rank factorisation
10) 3LC https://research.google/pubs/pub47962/
11) brevitas https://github.com/Xilinx/brevitas/
12) aimet https://github.com/quic/aimet

Structured pruning,Unstructured/semi-structured pruning,Quantization,Distillation,Post Training,Training-Aware,Sparse Transfer

AIMET is a library that provides advanced quantization and compression techniques for trained neural network models. https://github.com/quic/aimet

Pruning,Nonstructural pruning,Structural pruning,Quantisation-Aware Training,Post-Training Quantisation

QKeras: a quantization deep learning library for Tensorflow Keras

Model Compression https://github.com/open-mmlab/mmrazor

Knowledge Distillation: knowledge is categorized into three types: response-based knowledge, feature-based knowledge, and relation-based knowledge.
There are three principal ways of training student and teacher models: offline, online, and self-distillation.
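Response-based distillation, the first type above, trains the student to match the teacher's softened output distribution. A minimal sketch of one common formulation of the loss in plain Python; the `temperature` and `alpha` values and the example logits are assumptions chosen for illustration, not taken from KD_Lib or any specific library.

```python
import math

def softmax(logits, temperature=1.0):
    """Numerically stable softmax with temperature scaling."""
    scaled = [z / temperature for z in logits]
    peak = max(scaled)
    exps = [math.exp(z - peak) for z in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, true_label,
                      temperature=2.0, alpha=0.5):
    """Weighted sum of a soft-target KL term and a hard-label cross-entropy term."""
    soft_teacher = softmax(teacher_logits, temperature)
    soft_student = softmax(student_logits, temperature)
    # KL(teacher || student) on temperature-softened distributions.
    kl = sum(t * math.log(t / s) for t, s in zip(soft_teacher, soft_student))
    # Ordinary cross-entropy of the student against the true label.
    hard = -math.log(softmax(student_logits)[true_label])
    # The T^2 factor keeps the soft term's scale comparable across temperatures.
    return alpha * temperature ** 2 * kl + (1 - alpha) * hard

teacher = [4.0, 1.0, 0.0]        # made-up teacher logits
close_student = [4.0, 1.0, 0.0]  # student that already agrees with the teacher
far_student = [0.0, 1.0, 4.0]    # student that disagrees

loss_close = distillation_loss(close_student, teacher, true_label=0)
loss_far = distillation_loss(far_student, teacher, true_label=0)
print(loss_close, loss_far)
```

In the offline setting the teacher logits come from a frozen, pre-trained model; in online and self-distillation they are produced while the student trains.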

Distillation library KD_Lib https://github.com/SforAiDl/KD_Lib

ibm new tool https://www.zdnet.com/article/ibms-new-tool-lets-developers-add-quantum-computing-power-to-machine-learning/

qiskit-machine-learning https://github.com/Qiskit/qiskit-machine-learning https://qiskit.org/documentation/machine-learning/stubs/qiskit_machine_learning.neural_networks.SamplingNeuralNetwork.html

compressors https://github.com/elephantmipt/compressors

poniard scikit-learn model comparison https://github.com/rxavier/poniard

https://rachitsingh.com/deep-learning-model-compression/#quantization

model optimization (architecture)

TF Lite with iOS, Swift and TF Lite Swift

TinyML https://blog.tensorflow.org/2020/08/the-future-of-ml-tiny-and-bright.html

tinyml-papers-and-projects This is a list of interesting papers and projects about TinyML https://github.com/gigwegbe/tinyml-papers-and-projects

pennylane Python library for differentiable programming of quantum computers https://github.com/PennyLaneAI/pennylane

AI Engine for Edge Devices https://github.com/johnolafenwa/deepstack TensorFlow Lite Samples on Unity https://github.com/asus4/tf-lite-unity-sample

tflite-support TFLite Support is a toolkit that helps users to develop ML and deploy TFLite models onto mobile / ioT devices https://github.com/tensorflow/tflite-support

Post-training Quantization in TensorFlow Lite https://www.tensorflow.org/lite/performance/post_training_quantization

pruning
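Unstructured global pruning with the weight magnitude criterion, named in the pruning notes earlier in this section, can be sketched in plain Python: rank all weights by absolute value and zero out the smallest fraction. The weight matrix and the `sparsity` value are made up for illustration.

```python
def magnitude_prune(weights, sparsity):
    """Zero the `sparsity` fraction of weights with the smallest absolute value.

    Global and unstructured: all entries of the matrix compete in one ranking.
    """
    ranked = sorted(
        (abs(w), r, c)
        for r, row in enumerate(weights)
        for c, w in enumerate(row)
    )
    n_prune = int(len(ranked) * sparsity)
    pruned = [list(row) for row in weights]  # copy so the input is untouched
    for _, r, c in ranked[:n_prune]:
        pruned[r][c] = 0.0
    return pruned

# Made-up 2x3 weight matrix.
weights = [[0.8, -0.05, 0.3],
           [-0.6, 0.01, -0.2]]

pruned = magnitude_prune(weights, sparsity=0.5)
print(pruned)
```

Structured pruning would instead remove whole rows, columns, channels, or heads so the remaining tensor stays dense; the zeros produced here only save space or time when the storage format or runtime exploits sparsity.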

Custom Text Classification on Android using TensorFlow Lite https://www.analyticsvidhya.com/blog/2021/05/custom-text-classification-on-android-using-tensorflow-lite/

aimet advanced quantization and compression techniques for trained neural network models https://github.com/quic/aimet https://github.com/quic/aimet-model-zoo

Automatic Model Compression (AutoMC) framework for developing smaller and faster AI applications https://github.com/Tencent/PocketFlow

leverage of model architecture

Federated Learning https://www.analyticsvidhya.com/blog/2021/04/federated-learning-for-beginners/ https://www.tensorflow.org/federated

FEDERATED LEARNING(Centralized, Decentralized, Heterogeneous) https://blog.openmined.org/federated-learning-types/ https://aman.ai/primers/ai/federated-learning/

Federated Learning with FEDn https://github.com/scaleoutsystems/fedn

plato scalable federated learning research framework https://github.com/TL-System/plato

FedNLP: A Research Platform for Federated Learning in Natural Language Processing https://github.com/FedML-AI/FedNLP

privacy https://github.com/tensorflow/privacy

Differential Privacy https://aman.ai/primers/ai/differential-privacy/

Quantization: use quantization to reduce the size of a model https://medium.com/qiskit/introducing-qiskit-machine-learning-5f06b6597526

Post-Training Quantization
Quantization-Aware Training
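The core of post-training integer quantization is mapping float values onto 8-bit integers with a scale and a zero point. A minimal per-tensor sketch in plain Python; real toolchains such as TensorFlow Lite add calibration data, per-channel scales, and quantized kernels, none of which is modeled here, and the sample values are made up.

```python
def quantize_int8(values):
    """Affine (asymmetric) quantization of floats to the int8 range [-128, 127]."""
    lo = min(min(values), 0.0)  # include 0.0 so zero is exactly representable
    hi = max(max(values), 0.0)
    scale = (hi - lo) / 255.0
    if scale == 0.0:            # all-zero input
        scale = 1.0
    zero_point = round(-128 - lo / scale)
    quantized = [
        max(-128, min(127, round(v / scale) + zero_point)) for v in values
    ]
    return quantized, scale, zero_point

def dequantize(quantized, scale, zero_point):
    """Map int8 codes back to approximate float values."""
    return [(q - zero_point) * scale for q in quantized]

values = [-0.6, -0.1, 0.0, 0.4, 1.2]  # made-up weights
q, scale, zero_point = quantize_int8(values)
restored = dequantize(q, scale, zero_point)
max_error = max(abs(a - b) for a, b in zip(values, restored))
print(q, scale, zero_point)
print(restored, max_error)
```

Each value now needs 1 byte instead of 4 (float32), at the cost of a rounding error on the order of one `scale` step. Quantization-aware training simulates this rounding during training so the model learns to tolerate it.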

TensorFlow Quantum https://www.tensorflow.org/quantum

Qiskit Machine Learning https://github.com/Qiskit/qiskit-machine-learning

Quantum Machine Learning

Quantum Kernels https://github.com/Qiskit/qiskit-machine-learning/blob/master/docs/tutorials/03_quantum_kernel.ipynb

IBM’s Qiskit,Google’s Cirq,Amazon’s AWS Braket,Microsoft’s Q# and Azure Quantum,Rigetti’s Forest,Xanadu’s Pennylane

On-Device Machine Learning https://developers.google.com/learn/topics/on-device-ml https://www.tensorflow.org/lite/guide/model_maker

Core ML for iOS, Tensorflow lite for Android, ML.NET for Windows and ML Kit

8.Monitoring model

CI/CD pipelines used: CircleCI, Jenkins

In real-world projects, use a pipeline - https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html

1.easy debugging

2.better readability

Types of Data Drift

Concept drift,Virtual drift,Covariate shift,Prior probability shift,Annotator drift,Data poisoning

There are several measures you can take to mitigate the effects of data drift:

Regular retraining,Data preprocessing,Data augmentation,Monitoring,Online learning,Domain adaptation,Annotator and data quality control

Techniques for Detecting Data Drift

There are several techniques currently available for detecting data drift in machine learning:

Data visualization tools,Drift detection methods,Data quality control techniques,Drift detection libraries,Auto-ML tools
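One statistical drift detection method, the Population Stability Index (PSI), compares how a feature is distributed in reference data versus live data. A minimal sketch in plain Python, assuming equal-width bins derived from the reference sample; the bin count, the `eps` floor for empty bins, and the sample data are illustrative choices.

```python
import math

def population_stability_index(expected, actual, n_bins=10, eps=1e-6):
    """PSI between a reference sample (expected) and a live sample (actual)."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins
    if width == 0.0:
        width = 1.0

    def bin_fractions(values):
        counts = [0] * n_bins
        for v in values:
            # Values outside the reference range fall into the edge bins.
            idx = max(0, min(n_bins - 1, int((v - lo) / width)))
            counts[idx] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    e = bin_fractions(expected)
    a = bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))

reference = [i / 100 for i in range(100)]  # made-up training-time feature values
shifted = [x + 0.5 for x in reference]     # same feature after a shift

psi_same = population_stability_index(reference, reference)
psi_shifted = population_stability_index(reference, shifted)
print(psi_same, psi_shifted)
```

A PSI near 0 means the two distributions match. A commonly quoted rule of thumb treats values above roughly 0.25 as a significant shift, but thresholds should be validated per feature. Libraries such as whylogs, Evidently, and Alibi Detect package this and other tests.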

BIG DATA: hadoop,apache spark

project structure

data science project structure https://towardsdatascience.com/automate-your-data-science-project-structure-in-three-easy-steps-277c92328d24

research paper-https://arxiv.org/ ,https://arxiv.org/list/cs.LG/recent, https://www.kaggle.com/Cornell-University/arxiv

arXiv.org https://arxiv.org/list/cs.AI/recent https://arxiv.org/list/stat.ML/recent https://arxiv.org/list/cs.CL/recent https://arxiv.org/list/cs.CV/recent

https://github.com/amitness/papers-with-video

Datasets on arXiv https://medium.com/paperswithcode/datasets-on-arxiv-1a5a8f7bd104

code for research paper https://www.analyticsvidhya.com/blog/2021/06/steal-the-code-ethically-get-better-at-ml-ai-research/

papertalk https://papertalk.org/index

connected papers https://www.connectedpapers.com/

Stanford AI Lab Papers and Talks at ICLR 2021 https://ramseyelbasheer.io/2021/05/03/stanford-ai-lab-papers-and-talks-at-iclr-2021/

Semantic Scholar searches: https://www.semanticscholar.org/search?q=%22neural%20networks%22&sort=relevance&ae=false

https://www.semanticscholar.org/search?q=%22machine%20learning%22&sort=relevance&ae=false

https://www.semanticscholar.org/search?q=%22natural%20language%22&sort=relevance&ae=false

https://www.semanticscholar.org/search?q=%22computer%20vision%22&sort=relevance&ae=false

https://www.semanticscholar.org/search?q=%22deep%20learning%22&sort=relevance&ae=false

code for Research Papers-https://chrome.google.com/webstore/detail/find-code-for-research-pa/aikkeehnlfpamidigaffhfmgbkdeheil

Summarise Research Papers - https://www.semanticscholar.org/

Structure Your Data Science Projects https://towardsdatascience.com/structure-your-data-science-projects-6c6c8653c16a

Programming languages for data science: Python, R, Julia, Java, Scala, JavaScript (TensorFlow.js), etc.

IDE: Jupyter Notebook, Spyder, PyCharm, Visual Studio

4 Tools for Reproducible Jupyter Notebooks https://towardsdatascience.com/4-tools-for-reproducible-jupyter-notebooks-d7423721bd04

12 Jupyter Notebook Extensions That Will Make Your Life Easier https://towardsdatascience.com/12-jupyter-notebook-extensions-that-will-make-your-life-easier-e0aae0bd181

Coding Tools Powered by AI: GitHub Copilot, Tabnine, AI2SQL, MutableAI, MarsXm, Ghostwriter (Replit), Stenography, OpenAI Codex, CodeT5, PolyCoder, Seek, Cody by Sourcegraph, StableCode, DeciCoder, SantaCoder, Code Llama, Amazon CodeWhisperer, Bagasura

BEST ONLINE COURSES

  1. Coursera
  2. Udemy
  3. edX
  4. DataCamp
  5. Udacity
  6. https://www.skillbasics.com/

BEST YOUTUBE CHANNEL TO FOLLOW

  1. Krish Naik-https://www.youtube.com/user/krishnaik06
  2. Codebasics-https://www.youtube.com/channel/UCh9nVJoWXmFb7sLApWGcLPQ
  3. Abhishek Thakur-https://www.youtube.com/user/abhisheksvnit
  4. AIEngineering-https://www.youtube.com/channel/UCwBs8TLOogwyGd0GxHCp-Dw
  5. iNeuron-https://www.youtube.com/channel/UCb1GdqUqArXMQ3RS86lqqOw
  6. Ken Jee-https://www.youtube.com/c/KenJee1/featured
  7. 3Blue1Brown-https://www.youtube.com/c/3blue1brown/featured
  8. The AI Guy-https://www.youtube.com/channel/UCrydcKaojc44XnuXrfhlV8Q
  9. Unfold Data Science-https://www.youtube.com/channel/UCh8IuVJvRdporrHi-I9H7Vw etc.

BEST BLOGS TO FOLLOW

  1. https://www.cybrhome.com/topic/data-science-blogs
  2. AI Summary https://ai-summary.com/
  3. https://www.datasciencecentral.com/profiles/blog/list https://developer.nvidia.com/blog/?ncid=em-prom-48627
  4. Towards Data Science-https://towardsdatascience.com/
  5. Analytics Vidhya-https://www.analyticsvidhya.com/blog/?utm_source=feed&utm_medium=navbar https://analyticsindiamag.com/ https://www.analyticsinsight.net/
  6. Medium-https://medium.com/
  7. Machine Learning Mastery-https://machinelearningmastery.com/blog/
  8. ML+ -https://www.machinelearningplus.com/
  9. Analytics Insight https://www.analyticsinsight.net/category/latest-news/ https://www.analyticsinsight.net/
  10. KDnuggets https://www.kdnuggets.com/ https://www.kdnuggets.com/news/index.html
  11. Artificial Intelligence Database https://www.wired.com/category/artificial-intelligence/?verso=true
  12. https://machinelearningknowledge.ai/
  13. https://github.com/rushter/data-science-blogs
  14. https://www.datamuni.com/
  15. https://blog.ml.cmu.edu/?utm_source=towardsai.net&utm_medium=referral&utm_campaign=marketing&utm_term=machine-learning-blog&utm_content=best-machine-learning-blogs-to-follow
  16. https://www.amazon.science/blog?utm_source=towardsai.net&utm_medium=referral&utm_campaign=marketing&utm_term=machine+learning+blog&utm_content=machine+learning+blog&f0=0000016e-2ff1-d205-a5ef-aff9651e0000&s=0
  17. https://distill.pub/?utm_source=towardsai.net&utm_medium=referral&utm_campaign=marketing&utm_term=machine-learning-blog&utm_content=best-machine-learning-blogs-to-follow
  18. https://ai.googleblog.com/search/label/Machine%20Learning?utm_source=towardsai.net&utm_medium=referral&utm_campaign=marketing&utm_term=machine-learning-blog&utm_content=best-machine-learning-blogs-to-follow
  19. https://neptune.ai/blog?utm_source=towardsai.net&utm_medium=referral&utm_campaign=marketing&utm_term=machine+learning+blog&utm_content=machine+learning+blog
  20. https://bair.berkeley.edu/blog/?utm_source=towardsai.net&utm_medium=referral&utm_campaign=marketing&utm_term=machine-learning-blog&utm_content=best-machine-learning-blogs-to-follow
  21. https://deepmind.com/research?utm_source=towardsai.net&utm_medium=referral&utm_campaign=marketing&utm_term=machine-learning-blog&utm_content=machine-learning-blogs-to-follow&filters=%7B%22category%22:%5B%22Research%22%5D%7D
  22. https://ai.facebook.com/blog/?utm_source=towardsai.net&utm_medium=referral&utm_campaign=marketing&utm_term=machine-learning-blog&utm_content=machine-learning-blogs-to-follow
  23. https://becominghuman.ai/top-25-ai-and-machine-learning-blogs-for-data-scientists-9f121bcfd9a2
  24. https://medium.com/towards-artificial-intelligence/best-machine-learning-blogs-to-follow-ml-research-ai-3994e01967f9

BEST RESOURCES

https://amitness.com/toolbox/ https://khuyentran1401.github.io/Data-science/ https://github.com/ml-tooling/best-of-ml-python

https://github.com/ml-tooling/best-of-ml-python#machine-learning-frameworks http://dfkoz.com/ai-data-landscape/ https://landscape.lfai.foundation/

https://towardsdatascience.com/data-science-tools-f16ecd91c95d https://mathdatasimplified.com/ https://github.com/neomatrix369/awesome-ai-ml-dl

https://amitness.com/ https://postsyoumighthavemissed.com/search/

1.paperswithcode-https://paperswithcode.com/methods https://www.paperswithcode.com/datasets

paperswithcode-client https://github.com/paperswithcode/paperswithcode-client https://paperswithcode.com/lib/torchvision

https://www.connectedpapers.com/main/4f2eda8077dc7a69bb2b4e0a1a086cf054adb3f9/EfficientNet-Rethinking-Model-Scaling-for-Convolutional-Neural-Networks/graph

2.madewithml-https://madewithml.com/topics/ https://madewithml.com/courses/applied-ml-in-production/ https://github.com/GokuMohandas/applied-ml

modelzoo https://modelzoo.co/

Weights & Biases- https://wandb.ai/gallery sotabench-https://sotabench.com/

3.Deep learning-https://course.fullstackdeeplearning.com/#course-content

4.pytorch deep learning-https://atcold.github.io/pytorch-Deep-Learning/

PYTORCH HUB https://pytorch.org/hub/ https://pytorch.org/hub/research-models

5.https://papers.labml.ai/papers/daily https://42papers.com/

https://www.kdnuggets.com/2019/08/pytorch-cheat-sheet-beginners.html https://www.kdnuggets.com/2019/04/nlp-pytorch.html https://www.kdnuggets.com/2019/08/9-tips-training-lightning-fast-neural-networks-pytorch.html

fairscale PyTorch extensions for high performance and large scale training https://github.com/facebookresearch/fairscale

PyTorch Lightning-https://github.com/PyTorchLightning/pytorch-lightning https://www.kdnuggets.com/2020/11/deploy-pytorch-lightning-models-production.html

https://pytorch-lightning.medium.com/lightning-flash-0-3-new-tasks-visualization-tools-data-pipeline-and-flash-registry-api-1e236ba9530

PYTORCH - https://pytorch.org/ https://pytorch.org/ecosystem/ https://pytorch.org/tutorials/ https://pytorch.org/docs/stable/index.html https://github.com/pytorch/pytorch

PYTORCH Lightning https://pytorchlightning.ai/community#projects https://seannaren.medium.com/introducing-pytorch-lightning-sharded-train-sota-models-with-half-the-memory-7bcc8b4484f2

ort Accelerate PyTorch models with ONNX Runtime https://github.com/pytorch/ort

lightning-flash https://github.com/PyTorchLightning/lightning-flash https://pytorch-lightning.medium.com/introducing-lightning-flash-the-fastest-way-to-get-started-with-deep-learning-202f196b3b98

torchflare easy-to-use PyTorch Framework https://github.com/Atharva-Phatak/torchflare

Lightning Bolts collection of well established, SOTA models and components https://github.com/PyTorchLightning/lightning-bolts

Sharded: A New Technique To Double The Size Of PyTorch Models https://towardsdatascience.com/sharded-a-new-technique-to-double-the-size-of-pytorch-models-3af057466dba

Opacus (training PyTorch models with differential privacy)-https://opacus.ai/

light-face-detection https://github.com/borhanMorphy/light-face-detection

DALLE-pytorch https://github.com/lucidrains/DALLE-pytorch

PyTorch JIT -https://lernapparat.de/jit-optimization-intro/

jax- https://github.com/google/jax

incubator-mxnet - https://github.com/apache/incubator-mxnet

ignite-https://github.com/pytorch/ignite

fastText - https://github.com/facebookresearch/fastText

rapidminer-https://rapidminer.com/

5.deep-learning-drizzle-https://deep-learning-drizzle.github.io/ https://deep-learning-drizzle.github.io/index.html

6.Fastaibook-https://github.com/fastai/fastbook , https://course.fast.ai/ https://www.fast.ai/2019/07/08/fastai-nlp/ https://www.fast.ai/2020/08/21/fastai2-launch/

neptune.ai-https://docs.neptune.ai/index.html

Dive into Deep Learning http://d2l.ai/

7.TopDeepLearning-https://github.com/aymericdamien/TopDeepLearning

8.NLP-progress-https://github.com/sebastianruder/NLP-progress

9.EasyOCR, textract, pytesseract, tesserocr, Amazon Textract, TabulaPy, pyzbar, pyocr, OCR with Detectron2, PyMuPDF, Camelot, keras-ocr, Keras CRNN, PDFTableExtract (by PyPDF2), tesseract-ocr, Apache Tika, pdfPlumber, PDFMiner, PyPDF2, pdfMiner3, pdf2image, pdfquery, TextOCR, keras-CTPN, pytorch-CTPN, ocr.pytorch, layout-parser, tabula, Spark OCR, MMOCR, Amazon Rekognition, Azure OCR, Google OCR, PaddleOCR, TrOCR, awesome OCR, OCRmyPDF, calamari, attention ocr, Mozart, pdftabextract, Doc2Text, OpenCV’s EAST text detector, deepdoctection, slate3k, CRAFT-pytorch, ocr donut, LOGOS ocr, ocrpy, docquery, Parsr, DocuQuery, LayoutLM, docTR, CascadeTabNet, OpenCV, OCRopus, Kraken, PPOCR, MultiOcr, surya OCR, Bhashini

  1. Processing documents as Text: extract text with PyPDF2, extract tables with Camelot or TabulaPy, extract figures with PyMuPDF.
  2. Converting documents into Image (OCR): conversion with pdf2image, extract data with PyTesseract plus many other supporting libraries, or just LayoutParser.
  3. OCR toolbox from Davar-Lab https://github.com/hikopensource/davar-lab-ocr
  4. To pdf: python-pdfkit,wkhtmltopdf,FPDF

10.Awesome-pytorch-list-https://github.com/bharathgs/Awesome-pytorch-list https://shivanandroy.com/awesome-nlp-resources/

11.free-data-science-books-https://github.com/chaconnewu/free-data-science-books

12.arcgis-https://github.com/Esri/arcgis-python-api https://geemap.org/

13.data-science-ipython-notebooks-https://github.com/donnemartin/data-science-ipython-notebooks

14.julia-https://github.com/JuliaLang/julia , https://docs.julialang.org/en/v1/

15.google-research-https://github.com/google-research/google-research

16.reinforcement-learning-https://github.com/dennybritz/reinforcement-learning

17.keras-applications-https://github.com/keras-team/keras-applications , https://github.com/keras-team/keras https://keras.io/examples/

18.opencv-https://github.com/opencv/opencv

19.transformers-https://github.com/huggingface/transformers

20.code implementations for research papers-https://chrome.google.com/webstore/detail/find-code-for-research-pa/aikkeehnlfpamidigaffhfmgbkdeheil

21.regarding satellite images - Geo AI,Arcgis,geemap

Esri ArcGIS-https://www.esri.com/en-us/arcgis/about-arcgis/overview

earthcube-https://www.earthcube.eu/

geemap-https://geemap.org/

22.Monk_Object_Detection-https://github.com/Tessellate-Imaging/Monk_Object_Detection

https://github.com/Tessellate-Imaging/monk_v1

https://analyticsindiamag.com/build-computer-vision-applications-with-few-lines-of-code-using-monk-ai/

pyradox https://github.com/Ritvik19/pyradox

24.interview-question-data-science-https://github.com/iNeuronai/interview-question-data-science-

27.Tool for visualizing attention in the Transformer model-https://github.com/jessevig/bertviz

28.TransCoder-https://github.com/facebookresearch/TransCoder

29.Tessellate-Imaging-https://github.com/Tessellate-Imaging/monk_v1

Monk_Object_Detection-https://github.com/Tessellate-Imaging/Monk_Object_Detection/tree/master/application_model_zoo

Artificial-Intelligence-Deep-Learning-Machine-Learning-Tutorials- https://github.com/TarrySingh/Artificial-Intelligence-Deep-Learning-Machine-Learning-Tutorials

30.Machine-Learning-with-Python-https://github.com/tirthajyoti/Machine-Learning-with-Python

31.Hugging Face hosts pretrained models for almost every NLP task https://huggingface.co/course/chapter0?fw=pt

https://huggingface.co/models https://www.kdnuggets.com/2021/02/hugging-face-transformer-basics.html

https://github.com/huggingface https://github.com/huggingface/transformers https://huggingface.co/transformers/ https://huggingface.co/transformers/master/ https://github.com/huggingface/tokenizers

hugging face spaces https://huggingface.co/spaces

Hugging Face pipelines https://towardsdatascience.com/effortless-nlp-using-pre-trained-hugging-face-pipelines-with-just-3-lines-of-code-a4788d95754f

Fine-tuning pretrained NLP models with Huggingface’s Trainer https://towardsdatascience.com/fine-tuning-pretrained-nlp-models-with-huggingfaces-trainer-6326a4456e7b

Mixing Hugging Face Models with Gradio 2.0 https://gradio.app/blog/using-huggingface-models https://huggingface.co/blog/gradio

ktrain https://github.com/amaiya/ktrain

Top 6 Alternatives To Hugging Face https://analyticsindiamag.com/top-6-alternatives-to-hugging-face/

32.multi-task-NLP-https://github.com/hellohaptik/multi-task-NLP

33.gpt-2 - https://github.com/openai/gpt-2

34.Powerful and efficient Computer Vision Annotation Tool (CVAT)-https://github.com/openvinotoolkit/cvat, https://github.com/abreheret/PixelAnnotationTool

https://github.com/UniversalDataTool/universal-data-tool http://www.robots.ox.ac.uk/~vgg/software/via/

36.awesome Data Science-https://github.com/academic/awesome-datascience

39.Super Duper NLP Repo-https://notebooks.quantumstat.com/ https://models.quantumstat.com/ https://miro.com/app/board/o9J_kqndLls=/ https://datasets.quantumstat.com/

https://index.quantumstat.com/

https://notebooks.quantumstat.com/?utm_campaign=NLP%20News&utm_medium=email&utm_source=Revue%20newsletter

40.papers summarizing the advances in the field-https://github.com/eugeneyan/ml-surveys

41.deep-translator-https://github.com/nidhaloff/deep-translator

44.ipython-sql-https://github.com/catherinedevlin/ipython-sql

45.libra-https://github.com/Palashio/libra

47.learnopencv-https://github.com/spmallick/learnopencv , https://www.learnopencv.com/

48.math is fun-https://www.mathsisfun.com/ , https://pabloinsente.github.io/intro-linear-algebra, https://hadrienj.github.io/posts/Deep-Learning-Book-Series-Introduction/

49.DEEP LEARNING WITH PYTORCH: A 60 MINUTE BLITZ - https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html

50.https://data-flair.training/blogs/

https://data-flair.training/blogs/python-tutorials-home/ https://data-flair.training/blogs/hadoop-tutorials-home/ https://data-flair.training/blogs/spark-tutorials-home/

https://data-flair.training/blogs/tableau-tutorials-home/ https://data-flair.training/blogs/data-science-tutorials-home/

Spark Release 3.0.1-https://spark.apache.org/releases/spark-release-3-0-1.html https://neptune.ai/blog/apache-spark-tutorial

Koalas on Apache Spark - Pandas API https://www.youtube.com/watch?v=kOtAMiMe1JY&t=482s https://koalas.readthedocs.io/en/latest/

mllib https://spark.apache.org/docs/2.0.0/api/python/pyspark.mllib.html https://spark.apache.org/docs/2.0.0/api/python/index.html

https://data-flair.training/blogs/spark-tutorial/ Spark Core,Spark SQL,Spark Streaming,Spark MLlib,Spark GraphX,etc…

Machine Learning with Optimus on Apache Spark https://www.kdnuggets.com/2017/11/machine-learning-with-optimus.html

BigDL: Distributed Deep Learning Framework for Apache Spark https://github.com/intel-analytics/BigDL

51.for more cheatsheets-https://github.com/FavioVazquez/ds-cheatsheets , https://medium.com/swlh/the-ultimate-cheat-sheet-for-data-scientists-d1e247b6a60c

https://www.theinsaneapp.com/2020/12/machine-learning-and-data-science-cheat-sheets-pdf.html

https://stanford.edu/~shervine/teaching/cs-229/cheatsheet-supervised-learning

52.text2emotion-https://pypi.org/project/text2emotion/

53.ExploriPy-https://analyticsindiamag.com/hands-on-tutorial-on-exploripy-effortless-target-based-eda-tool/

54.TCN-https://github.com/philipperemy/keras-tcn

56.earthengine-py-notebooks-https://github.com/giswqs/earthengine-py-notebooks

58.numerical-linear-algebra -https://github.com/fastai/numerical-linear-algebra

61.Chatbots: from scratch, Google Dialogflow, Rasa NLU, Azure LUIS (Luis.ai), Azure Bot Service, ChatterBot, Amazon Lex, Wit.ai, IBM Watson, Parrot, etc.

ChatterBot, Botkit, Botpress, Bottender, IBM Watson, Microsoft Bot Framework, Pandorabots, Rasa Stack, BlenderBot 3, DeepPavlov, OpenDialog, Tock, Wit.ai, Proto AIC, HubSpot Chatbot Builder, Intercom, Zendesk, Freshworks, Botsify, Tidio, Infobip, OpenChat

ChatGPT (OpenAI chatbot and search engine), Meta ChatLLaMA, VisualChatGPT, ViperGPT, GPT-4, AutoGPT, BabyAGI, ChaosGPT, AgentGPT, MiniGPT-4, GPT4All, Dolly, CAMEL, Claude 2, Bing, Code Interpreter, Anthropic’s, WizardCoder

Bard (Google chatbot and search engine), PaLM API, OpenChatKit: Open-Source ChatGPT Alternative

Meta LLaMA, LLaMA-v2, Alpaca 7B, h2o-llmstudio, StableLM, HuggingChat

Ernie Bot, Baidu chatbot, Claude, Alpaca, ChatGLM, BloombergGPT, Vicuna, StackLLaMA, h2o-llmstudio, Claude 2, Perplexity AI, FreeWilly1, FreeWilly2, Falcon, Dolly, Guanaco, BLOOMZ, OpenChatKit, GPT4All, Flan-T5, FalconLite, StableBeluga2, Tongyi Qianwen

no code chatbots https://juji.io/

https://github.com/fendouai/Awesome-Chatbot https://medium.com/nerd-for-tech/make-money-building-a-fast-powerful-chatbot-in-10-minutes-using-nltk-91038e15ab17

https://www.analyticsinsight.net/category/chatbots/ https://www.promaticsindia.com/blog/here-are-the-most-popular-chatbot-development-frameworks/

https://neptune.ai/blog/building-machine-learning-chatbots-platforms-and-applications?utm_source=twitter&utm_medium=tweet&utm_campaign=blog-building-machine-learning-chatbots-platforms-and-applications

https://blog.ubisend.com/optimise-chatbots/chatbot-training-data

OpenChat: Open Source Chatting Framework for Generative Models https://analyticsindiamag.com/a-brief-overview-of-openchat-open-source-chatting-framework-for-generative-models/

  1. No Code Machine Learning / Deep Learning https://analyticsindiamag.com/top-12-no-code-machine-learning-platforms-in-2021/ https://www.pye.ai/2021/06/01/2021-list-of-top-data-science-platforms-end-to-end-machine-learning/

    https://serokell.io/blog/top-no-code-platforms https://www.nanalyze.com/2021/04/no-code-platforms-machine-learning/

    Akkio, Obviously.ai, DataRobot, Levity, Clarifai, Teachable Machine, Lobe, pimer, DynaBench, APAflow, RunwayML, CreateML, MakeML, Fritz AI, MonkeyLearn, Nanonets, SuperAnnotate, causaLens, BigML, actable, Bonsai, labelsleuth, Cooka, Oracle AutoML, Edge Impulse, Mantium AI, Sway, Graphite Note, Noogata, Pecan, RapidMiner, KNIME, DashB.ai, NoCode-ML, BMW-TensorFlow-Training-GUI

    Teachable Machine-https://teachablemachine.withgoogle.com/ Vertex AI https://cloud.google.com/vertex-ai/docs/start/automl-users

    Microsoft Lobe -https://lobe.ai/

    Ludwig https://github.com/ludwig-ai/ludwig

    WEKA - https://www.cs.waikato.ac.nz/ml/weka/ autoweka

    Create ML https://developer.apple.com/documentation/createml

    APAflow https://apaflow.com/?utm_medium=social&utm_source=linkedin&utm_campaign=postfity&utm_content=postfity0b527 https://apaflow.com/

    Monk_Gui-https://github.com/Tessellate-Imaging/Monk_Gui

    FlashML https://www.flash-ml.com/

    JADBio’s https://www.jadbio.com/

    JOHN SNOW LABS https://www.johnsnowlabs.com/models-training-and-active-learning-in-john-snow-labs-annotation-lab/

    igel https://github.com/nidhaloff/igel

    BRYTER https://bryter.com

    Ushur https://ushur.com

    Accern https://accern.com

    Signzy https://signzy.com

    Runway https://runwayml.com

    Fritz AI https://www.fritz.ai

    BigML, Inc https://bigml.com

    MyDataModels https://lnkd.in/eejjDbM

    MonkeyLearn https://monkeylearn.com

    Levity https://levity.ai

    Nanonets https://nanonets.com

    obviously https://www.obviously.ai/

    machine learning straight from Microsoft Excel https://venturebeat.com/2020/12/30/you-dont-code-do-machine-learning-straight-from-microsoft-excel/

    ENNUI-https://math.mit.edu/ennui/ https://github.com/martinjm97/ENNUI https://www.youtube.com/watch?v=4VRC5k0Qs2w

    Knime https://www.knime.com/

    Accord.net http://accord-framework.net/

    DeepDev https://realmichaelye.github.io/DeepDev/deepdev.tech%20-%20Landing%20Page/ https://github.com/realmichaelye/DeepDev

    H2O Driverless AI https://www.h2o.ai/products/h2o-driverless-ai/

    Oracle AutoML https://medium.com/nerd-for-tech/oracles-automl-what-it-is-and-how-it-works-12e09a832c2 https://docs.oracle.com/en-us/iaas/tools/ads-sdk/latest/user_guide/overview/overview.html

    Rapid Miner https://rapidminer.com/

    opennn https://www.opennn.net/

    datarobot https://www.datarobot.com/

    dataiku https://www.dataiku.com/product/get-started/

    orange https://orange.biolab.si/

    Databricks AutoML Automate Machine Learning using Databricks AutoML https://pub.towardsai.net/automate-machine-learning-using-databricks-automl-a-glass-box-approach-and-mlflow-2543a8143687

    OpenBlender https://openblender.io/#/welcome https://analyticsindiamag.com/how-to-use-openblender-the-leading-data-blending-tool/

    create neural networks with one line of code https://github.com/PraneetNeuro/nnio.l

    AWS SageMaker AutoPilot https://aws.amazon.com/sagemaker/autopilot/

    Machine Learning in JUST ONE LINE OF CODE libra https://github.com/Palashio/libra/ https://www.youtube.com/watch?v=N_T_ljj5vc4

    perceptilabs https://towardsdatascience.com/easy-model-building-with-perceptilabs-interactive-tensorflowvisualization-gui-834d5bb3c973

    64.tensorflow development-https://blog.tensorflow.org/

    TensorFlow Hub (trained ready-to-deploy machine learning models in one place) - https://tfhub.dev/

    CrypTFlow: An End-to-end System for Secure TensorFlow Inference https://github.com/mpc-msri/EzPC https://pratik-bhatu.medium.com/privacy-preserving-machine-learning-for-healthcare-using-cryptflow-cc6c379fbab7

    TensorBoard.dev - https://tensorboard.dev/

    tutorials-https://www.tensorflow.org/tutorials https://www.tensorflow.org/guide

    TensorFlow Graphics - https://www.tensorflow.org/graphics Lattice-https://www.tensorflow.org/lattice

    TensorFlow Probability-https://www.tensorflow.org/probability TensorFlow Privacy- tensorflow-privacy

    https://developers.google.com/learn/topics/on-device-ml https://www.tensorflow.org/lite/guide/model_maker https://tfhub.dev/ https://www.tensorflow.org/cloud

    63.Data Science in the Cloud-Amazon SageMaker,Amazon Lex,Amazon Rekognition,Azure Machine Learning (Azure ML) Services,Azure Service Bot framework,Google Cloud AutoML

    64.platforms to build and deploy ML models -Uber has Michelangelo,Google has TFX,Databricks has MLFlow,Amazon Web Services (AWS) has Sagemaker

    66.ML from scratch-https://dafriedman97.github.io/mlbook/content/introduction.html

    https://aihubprojects.com/machine-learning-from-scratch-python/

    https://github.com/python-engineer/MLfromscratch https://www.youtube.com/watch?v=rLOyrWV8gmA

    https://www.datasciencecentral.com/profiles/blogs/a-complete-tutorial-to-learn-data-science-with-python-from

    https://medium.com/@mattybv3/learn-data-science-from-scratch-curriculum-with-20-free-online-courses-8cff96d6cbe5

    67.turn-on visual training for most popular ML algorithms https://github.com/lucko515/ml_tutor https://pypi.org/project/ml-tutor/

    68.mlcourse.ai is a free online course - https://mlcourse.ai/

    72.R for Data Science-https://r4ds.had.co.nz/ ,Fundamentals of Data Visualization-https://clauswilke.com/dataviz/

    74.machine learning in JavaScript-https://www.tensorflow.org/js https://www.tensorflow.org/js/models https://tensorflow-js-object-detection.glitch.me/

    TensorFlow.jl Julia with TensorFlow https://malmaud.github.io/tfdocs/ https://malmaud.github.io/TensorFlow.jl/latest/tutorial.html

    Sonnet is a library built on top of TensorFlow 2 https://github.com/deepmind/sonnet

    TensorFlow Federated (TFF) ( facilitate open research and experimentation with Federated Learning)-https://www.tensorflow.org/federated

    TFX is an end-to-end platform for deploying production ML pipelines https://www.tensorflow.org/tfx https://github.com/tensorflow/tfx https://analyticsindiamag.com/guide-to-tensorflow-extendedtfx-end-to-end-platform-for-deploying-production-ml-pipelines/

    Federated Learning -https://www.tensorflow.org/federated/tutorials/federated_learning_for_image_classification

    Neural Structured Learning-https://www.tensorflow.org/neural_structured_learning/tutorials/graph_keras_mlp_cora

    Responsible AI-https://www.tensorflow.org/resources/responsible-ai

    https://www.tensorflow.org/graphics

    75.free list of AI/ Machine Learning Resources/Courses-https://www.marktechpost.com/free-resources/

    https://github.com/kabartay/OpenUnivCourses

    Open ML University https://curriculum.openmlu.com/

    https://www.kdnuggets.com/2018/11/10-free-must-see-courses-machine-learning-data-science.html

    https://www.kdnuggets.com/2018/12/10-more-free-must-see-courses-machine-learning-data-science.html

    https://www.theinsaneapp.com/2020/12/machine-learning-and-data-science-cheat-sheets-pdf.html

    https://www.theinsaneapp.com/2020/11/free-machine-learning-data-science-and-python-books.html

    65 Machine Learning and Data books for free- https://towardsdatascience.com/springer-has-released-65-machine-learning-and-data-books-for-free-961f8181f189

    https://www.deeplearningbook.org/ http://d2l.ai/ https://www.theinsaneapp.com/2020/12/download-free-machine-learning-books.html

    https://www.datasciencecentral.com/profiles/blogs/free-500-page-book-on-applications-of-deep-neural-networks-1 https://github.com/jeffheaton/t81_558_deep_learning

    https://www.theinsaneapp.com/2020/12/free-data-science-books-pdf.html

    https://github.com/chaconnewu/free-data-science-books

    https://www.kdnuggets.com/2020/03/24-best-free-books-understand-machine-learning.html

    https://www.kdnuggets.com/2020/12/15-free-data-science-machine-learning-statistics-ebooks-2021.html

    http://introtodeeplearning.com/

    http://d2l.ai/index.html https://www.kdnuggets.com/2020/09/best-free-data-science-ebooks-2020-update.html

    https://www.youtube.com/playlist?app=desktop&list=PLypiXJdtIca5ElZMWHl4HMeyle2AzUgVB https://mit6874.github.io/

    79.For practice -https://www.confetti.ai/exams

    80.Yellowbrick-https://towardsdatascience.com/introduction-to-yellowbrick-a-python-library-to-explain-the-prediction-of-your-machine-learning-d63ecee10ecc

    81.Mathematics of Machine Learning,deep learning-https://towardsdatascience.com/the-mathematics-of-machine-learning-894f046c568

    https://github.com/hrnbot/Basic-Mathematics-for-Machine-Learning

    https://towardsdatascience.com/the-roadmap-of-mathematics-for-deep-learning-357b3db8569b

    https://medium.com/towards-artificial-intelligence/basic-linear-algebra-for-deep-learning-and-machine-learning-ml-python-tutorial-444e23db3e9e

    https://www.kdnuggets.com/2020/02/free-mathematics-courses-data-science-machine-learning.html

    https://towardsai.net/p/data-science/how-much-math-do-i-need-in-data-science-d05d83f8cb19

    https://www.mltut.com/how-to-learn-math-for-machine-learning-step-by-step-guide/

    https://stanford.edu/~shervine/teaching/cs-229/cheatsheet-machine-learning-tips-and-tricks#

    https://www.datasciencecentral.com/profiles/blogs/free-online-book-machine-learning-from-scratch

    https://hadrienj.github.io/posts/Essential-Math-for-Data-Science-Introduction_to_matrices_and_matrix_product/?utm_source=linkedin&utm_medium=social&utm_campaign=linkedin_matrices

    https://www.youtube.com/playlist?list=PLRDl2inPrWQW1QSWhBU0ki-jq_uElkh2a https://github.com/jonkrohn/ML-foundations

    https://ocw.mit.edu/resources/res-18-001-calculus-online-textbook-spring-2005/textbook/

    82.Googleai-https://ai.google/education

    83.ELI5 is a Python package which helps to debug machine learning classifiers and explain their predictions

    PyBrain is a modular Machine Learning Library for Python

    84.Best Online Courses for Machine Learning and Data Science-https://www.mltut.com/best-online-courses-for-machine-learning-and-data-science/

    Comprehensive Project Based Data Science Curriculum https://julienbeaulieu.github.io/2019/09/25/comprehensive-project-based-data-science-curriculum/

    AI Expert Roadmap-https://i.am.ai/roadmap/#data-science-roadmap

    86.Yann LeCun’s Deep Learning Course at CDS-https://cds.nyu.edu/deep-learning/ https://atcold.github.io/pytorch-Deep-Learning/

    https://www.cs.cmu.edu/~ninamf/courses/601sp15/lectures.shtml

    88.Python Data Science Handbook https://jakevdp.github.io/PythonDataScienceHandbook/

    91.AudioFeaturizer when deal with audio data- https://pypi.org/project/AudioFeaturizer/

    librosa library https://librosa.org/doc/latest/index.html

    MAGENTA-https://magenta.tensorflow.org/

    pydub https://github.com/jiaaro/pydub

    DDSP: Differentiable Digital Signal Processing https://github.com/magenta/ddsp https://analyticsindiamag.com/guide-to-differentiable-digital-signal-processing-ddsp-library-with-python-code/

    92.Palladium-https://palladium.readthedocs.io/en/latest/

    94.Facebook Open Sourced New Frameworks to Advance Deep Learning Research https://www.kdnuggets.com/2020/11/facebook-open-source-frameworks-advance-deep-learning-research.html

    95.Software Engineering for Machine Learning https://github.com/SE-ML/awesome-seml

    96.Atlas web-based dashboard -https://www.atlas.dessa.com/

    97.Pytest (test code) https://docs.pytest.org/en/latest/index.html

    98.keras- https://keras.io/ https://keras.io/api/ https://keras.io/examples/

    99.High-Performance Jupyter Notebook - BlazingSQL Notebooks https://blazingsql.com/notebooks

    jupyter-tabnine https://github.com/wenmin-wu/jupyter-tabnine

    101.Kubeflow Machine Learning Toolkit for Kubernetes https://www.kubeflow.org/

    102.Daily AI updates to your inbox- https://sago-ai.news/#/

    103.Three Keras API styles: Sequential model, Functional API, Model subclassing

    104.Deep Learning Toolkit for Medical Image Analysis -https://github.com/DLTK/DLTK

    3 Python Packages for Machine Learning Validation: Evidently, Deepchecks, TensorFlow Data Validation

    106.Explainability: Model-specific explainability (the method applies only to one specific type of model), Model-agnostic explainability (the method can explain any type of model), Model-centric explainability (most explanation methods are model-centric; they explain how features and target values are being adjusted), Data-centric explainability (methods used to understand the nature of the data)

    Interpret The ML Model https://towardsdatascience.com/explainable-artificial-intelligence-part-3-hands-on-machine-learning-model-interpretation-e8ebe5afc608

    https://christophm.github.io/interpretable-ml-book/ https://www.kaggle.com/getting-started/209632 https://ex.pegg.io/

    https://neptune.ai/blog/explainability-auditability-ml-definitions-techniques-tools?utm_source=twitter&utm_medium=tweet&utm_campaign=blog-explainability-auditability-ml-definitions-techniques-tools

    shap, lime, Shapash, webshap, ELI5, InterpretML, Concept Relevance Propagation, OmniXAI, Treeinterpreter, Dalex, Yellowbrick, Mlxtend, PDPBox, Partial Dependence Plots (PDP), Individual Conditional Expectation (ICE) plots, Accumulated Local Effects (ALE) curves, Permutation Importance, Causal SHAP values, Integrated Gradients (IG), Anchors, Feature importance/attribution, SmoothGrad, DeepLIFT, GradientExplainer, decision tree surrogates,
    xplique, Morris Sensitivity Analysis, Contrastive Explanation Method (CEM), Counterfactual explanations, Global Interpretation via Recursive Partitioning (GIRP), Protodash, Scalable Bayesian Rule Lists, Explainable Boosting Machine (EBM), ALIBI, DiCE, explainerdashboard, TCAV, PiML, Layerwise Relevance Propagation, solas, ferret, Saliency maps, Distillation, PyALE, Fast Interpretable Greedy-Tree Sums, imodels, Feature Ablation, Occlusion, captum, GradientShap, FastTreeShap, DeepLiftShap, LayerConductance, NeuronConductance, NoiseTunnel, interpret-text, aix360, BreakDown, iml (Interpretable Machine Learning), Netron, DoWhy, CausalNex, fairlearn, arviz, iNNvestigate, Model Analysis, TE2Rules

    OmniXAI: A Library for eXplainable AI https://github.com/salesforce/OmniXAI

    Xplique is a Neural Networks Explainability Toolbox https://github.com/deel-ai/xplique/

    Ethical-AI Toolkits https://murat-durmus.medium.com/an-brief-overview-of-some-ethical-ai-toolkits-712afe9f3b3a

    ferret python package for benchmarking interpretability techniques https://github.com/g8a9/ferret

    Explaining machine learning models https://github.com/SeldonIO/alibi https://github.com/salesforce/OmniXAI

    Awesome-explainable-AI https://ex.pegg.io/

    tf-explain https://github.com/sicara/tf-explain imodels https://github.com/csinva/imodels

    lime(explain black box models)- https://lime-ml.readthedocs.io/en/latest/ https://towardsdatascience.com/interpreting-image-classification-model-with-lime-1e7064a2f2e5

    SHAP https://medium.com/towards-artificial-intelligence/explain-your-machine-learning-predictions-with-kernel-shap-kernel-explainer-fed56b9250b8

    SHAP variants: Kernel SHAP, TreeSHAP, Shparkley, Deep SHAP, TimeSHAP, PySpark-SHAP, GPUTreeSHAP, FastTreeSHAP (accelerating SHAP value computation for trees) https://github.com/linkedin/fasttreeshap

    https://github.com/slundberg/shap https://www.kdnuggets.com/2020/01/explaining-black-box-models-ensemble-deep-learning-lime-shap.html https://analyticsindiamag.com/hands-on-guide-to-interpret-machine-learning-with-shap/

    fastshap https://github.com/bgreenwell/fastshap

    Shapash makes Machine Learning models transparent and understandable by everyone https://github.com/MAIF/shapash https://www.kdnuggets.com/2021/04/shapash-machine-learning-models-understandable.html

    Captum is a model interpretability and understanding library for PyTorch https://github.com/pytorch/captum

    Explainable AI dashboards https://github.com/oegedijk/explainerdashboard https://www.youtube.com/watch?v=ZgypAMRcmw8

    interpret https://github.com/interpretml/interpret mlxtend http://rasbt.github.io/mlxtend/

    imodels Interpretable ML package https://github.com/csinva/imodels

    Quantus eXplainable AI toolkit https://github.com/understandable-machine-intelligence-lab/quantus

    DiCE Generate Diverse Counterfactual Explanations for any machine learning model. https://github.com/interpretml/DiCE

    tcav https://github.com/tensorflow/tcav yellowbrick https://www.scikit-yb.org/en/latest/quickstart.html

    Language Interpretability Tool https://github.com/pair-code/lit https://ai.googleblog.com/2020/11/the-language-interpretability-tool-lit.html

    Transformers Interpret https://towardsdatascience.com/introducing-transformers-interpret-explainable-ai-for-transformers-890a403a9470 https://github.com/cdpierse/transformers-interpret

    treeinterpreter https://github.com/andosa/treeinterpreter

    Adversarial Explainable AI https://github.com/hbaniecki/adversarial-explainable-ai https://medium.com/responsibleml/adversarial-attacks-on-explainable-ai-f65d41e83c5f

    Captum Model Interpretability for PyTorch https://captum.ai/ https://github.com/pytorch/captum

    ecco https://github.com/jalammar/ecco https://jalammar.github.io/explaining-transformers/ https://www.eccox.io/

    dalex https://pypi.org/project/dalex/ https://blog.learningdollars.com/2021/01/02/ai-in-medical-diagnosis/ https://www.kdnuggets.com/2020/11/dalex-explain-tensorflow-model.html

    google AI Explanations for AI Platform https://cloud.google.com/ai-platform/prediction/docs/ai-explanations/overview?utm_source=youtube&utm_medium=Unpaidsocial&utm_campaign=guo-20200423-Intro-Aiexp

    eli5 https://eli5.readthedocs.io/en/latest/

    Integrated-Gradients https://github.com/ankurtaly/Integrated-Gradients

    TabNet: Attentive Interpretable Tabular Learning https://github.com/dreamquark-ai/tabnet

    skater https://oracle.github.io/Skater/

    lucid https://github.com/tensorflow/lucid/ https://www.kdnuggets.com/2020/04/openai-open-sources-microscope-lucid-library-neural-networks.html

    what if tool https://pair-code.github.io/what-if-tool/ https://pair-code.github.io/what-if-tool/demos/uci.html

    themis https://themis-ml.readthedocs.io/en/latest/

    DeepLIFT https://github.com/kundajelab/deeplift

    Arena https://medium.com/responsibleml/python-has-now-the-new-way-of-exploring-xai-explanations-4248846426cf

    tabnet https://cloud.google.com/blog/products/ai-machine-learning/ml-model-tabnet-is-easy-to-use-on-cloud-ai-platform

    explainerdashboard https://towardsdatascience.com/the-quickest-way-to-build-dashboards-for-machine-learning-models-ec769825070d

    Responsible AI-https://www.tensorflow.org/resources/responsible-ai

    fairlearn https://github.com/fairlearn/fairlearn fairml https://github.com/adebayoj/fairml https://www.datasciencecentral.com/profiles/blogs/fairml-auditing-black-box-predictive-models

    fair https://medium.com/responsibleml/how-to-easily-check-if-your-ml-model-is-fair-2c173419ae4c

    cleverhans https://github.com/cleverhans-lab/cleverhans

    Google Facets https://pair-code.github.io/facets/

    Google’s Model Card Toolkit

    Opening the AI Black Box -https://zetane.com/gallery

    Rulex Explainable AI https://www.rulex.ai/rulex-explainable-ai-xai/

    AI Explainability 360 Toolkit from IBM Research https://aix360.mybluemix.net/ https://analyticsindiamag.com/guide-to-ai-explainability-360-an-open-source-toolkit-by-ibm/

    onnx https://github.com/onnx/onnx

    torch-dreams https://github.com/Mayukhdeb/torch-dreams

    https://github.com/jphall663/awesome-machine-learning-interpretability

    https://analyticsindiamag.com/8-explainable-ai-frameworks-driving-a-new-paradigm-for-transparency-in-ai/

    https://christophm.github.io/interpretable-ml-book/ https://github.com/christophM/interpretable-ml-book

    https://www.kdnuggets.com/2018/12/machine-learning-explainability-interpretability-ai.html https://www.kdnuggets.com/2019/09/python-libraries-interpretable-machine-learning.html https://www.kdnuggets.com/2019/08/open-black-boxes-explainable-machine-learning.html

    Fairness https://analyticsindiamag.com/building-a-responsible-ai-eco-system/

    How to easily check if your Machine Learning model is fair (dalex) https://www.kdnuggets.com/2020/12/machine-learning-model-fair.html

    TensorFlow Federated, TensorFlow Model Remediation, TensorFlow Privacy, LinkedIn Fairness Toolkit, Fairlearn, AI Fairness 360, Responsible AI Toolbox, XAI, scikit-fairness, Fairlead, Algofairness, Aequitas, CERTIFAI, ML-fairness-gym, FairSight, GD-IQ, Mitigating Gender Bias In Captioning System, Model Card Toolkit, AI Explainability 360, Adversarial Robustness 360, Uncertainty Quantification 360, AI Privacy 360, Causal Inference 360, AI FactSheets 360, Deon, DALEX, TensorFlow Data Validation, Fawkes, AdverTorch, solasai, Gluru, Conversica, Quill AI, TextAttack, Themis-ML, Debiaswe, fairness-in-ml, bias-correction, BlackBoxAuditing, fairness-indicators, Awesome-Fairness-in-AI

    https://analyticsindiamag.com/guide-to-ai-fairness-360-an-open-source-toolkit-for-detection-and-mitigation-of-bias-in-ml-models/

    107.deep-learning-drizzle -https://deep-learning-drizzle.github.io/

    108.Machine Learning University - https://aws.amazon.com/machine-learning/mlu/

    109.Continuous Machine Learning (CML),OpenMLOps,Metaflow,Kubeflow,Data Version Control (DVC),Kedro

    mlflow https://mlflow.org/ An open source platform for the machine learning lifecycle

    Layer https://docs.app.layer.ai/docs/

    https://www.kdnuggets.com/2021/01/5-tools-effortless-data-science.html

    https://neptune.ai/

    https://azure.microsoft.com/en-us/services/machine-learning/

    https://github.com/VertaAI/modeldb

    110.Data Preparation / ETL https://airflow.apache.org/ https://intake.readthedocs.io/en/latest/

    111.fairlearn https://github.com/fairlearn/fairlearn/blob/master/README.md A toolkit for evaluating the fairness of AI/ML models and training data, and for mitigating bias in models determined to be unfair.
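
    One simple way to "evaluate fairness" is to compare how often a model gives the positive outcome to each group. The sketch below computes that gap (often called the demographic parity difference) by hand with plain Python. It does not use the fairlearn API; the function names, the predictions, and the group labels are invented for illustration.

```python
def selection_rate(predictions):
    """Share of positive (1) predictions."""
    return sum(predictions) / len(predictions)


def parity_gap(predictions, groups):
    """Largest difference in selection rate between any two groups."""
    by_group = {}
    for pred, group in zip(predictions, groups):
        by_group.setdefault(group, []).append(pred)
    rates = {g: selection_rate(p) for g, p in by_group.items()}
    return max(rates.values()) - min(rates.values()), rates


# Hypothetical model outputs for eight people in two groups, "a" and "b".
predictions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

gap, rates = parity_gap(predictions, groups)
print(rates)  # selection rate per group
print(gap)    # 0 would mean both groups are selected at the same rate
```

    A gap near zero means the groups receive positive predictions at similar rates. Whether that is the right fairness criterion depends on the application, which is why the toolkits above offer several metrics.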

    AI Fairness 360: evaluates the fairness of AI/ML models and training data and mitigates bias in current models https://aif360.mybluemix.net/

    An ethics checklist for data scientists https://deon.drivendata.org/

    112.https://analyticsindiamag.com/top-6-ai-powered-drug-discovery-tools-in-2021/

    MONAI Framework For Medical Imaging Research https://analyticsindiamag.com/monai-datatsets-managers/

    torchio https://github.com/fepegar/torchio https://analyticsindiamag.com/torchio-3d-medical-imaging/

    MolBert: Molecular Representation learning with AI

    medicalAI https://github.com/aibharata/medicalAI

    Biopython is a set of freely available tools https://github.com/biopython/biopython

    DeepIPW https://github.com/ruoqi-liu/DeepIPW

    113.OpenVINO https://opencv.org/openvino-model-optimization/ https://opencv.org/how-to-speed-up-deep-learning-inference-using-openvino-toolkit-2/

    114.https://neptune.ai/blog/machine-learning-model-management https://analyticsindiamag.com/top-mlops-tools-github-repos/ https://neu.ro/2021-mlops-platforms-vendor-analysis-report/

    Best Workflow and Pipeline Orchestration Tools https://neptune.ai/blog/best-workflow-and-pipeline-orchestration-tools?utm_source=twitter&utm_medium=tweet&utm_campaign=blog-best-workflow-and-pipeline-orchestration-tools

    MLflow vs Kubeflow vs Neptune https://neptune.ai/blog/mlflow-vs-kubeflow-vs-neptune-differences?utm_source=twitter&utm_medium=tweet&utm_campaign=blog-mlflow-vs-kubeflow-vs-neptune-differences

    15 MLOps.toys https://mlops.toys/ AIOps, Data Version Control (DVC), MLflow, Docker foundations, Kubernetes foundations, TensorFlow Extended (TFX), Kubeflow, AWS AIOps, Azure AIOps, TensorBoard, Weights & Biases, Neptune AI, Comet, aim

    Data verification: Scale Nucleus, Great Expectations, Soda Data Observability

    Metadata management: Neptune.ai, SiaSearch, TensorFlow’s ML Metadata

    Data management: Neptune, DVC, Roboflow, Dataiku, Apache Airflow, Apache NiFi, Apache Kafka

    Feature Stores: Amazon SageMaker Feature Store, Databricks, Hopsworks.ai, Vertex AI, Featureform, Feast, Tecton, butterfree, ByteHub

    Data Quality: whylogs, eurybia

    Detecting data drift and model drift: eurybia
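
    As a rough, tool-independent illustration of data drift detection, the sketch below computes the Population Stability Index (PSI), one common way to quantify how far a feature's distribution has moved between a reference sample and a current sample. This is not eurybia's API or method; the function name `psi`, the bin count, the smoothing constant, and the toy data are assumptions made for the example.

```python
import math


def psi(reference, current, n_bins=5, eps=1e-6):
    """Population Stability Index between two samples of one numeric feature."""
    lo, hi = min(reference), max(reference)
    # Equal-width bins over the reference range; fall back to 1.0 if the range is empty.
    width = (hi - lo) / n_bins or 1.0

    def fractions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            idx = min(max(idx, 0), n_bins - 1)  # clamp out-of-range values into edge bins
            counts[idx] += 1
        # eps avoids log(0) for empty bins.
        return [max(c / len(values), eps) for c in counts]

    ref, cur = fractions(reference), fractions(current)
    return sum((c - r) * math.log(c / r) for r, c in zip(ref, cur))


# Toy data: the "current" sample is the reference shifted upward by 0.5.
reference = [i / 100 for i in range(100)]
shifted = [v + 0.5 for v in reference]

print(psi(reference, reference))  # identical samples give 0.0
print(psi(reference, shifted))    # a shifted sample gives a clearly positive value
```

    In a monitoring job, a check like this would typically run per feature on a schedule, with an alert threshold chosen for the dataset at hand.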

    Experiment tracking: Kedro, ModelDB, MLflow, DVC, Weights & Biases, Neptune, TensorBoard, Determined, Polyaxon, MLRun, Comet, Sacred, DagsHub, Guild AI, ClearML, Valohai, Pachyderm, Verta.ai, Kubeflow, SageMaker Studio

    Monitoring: Prometheus, Grafana, ELK Stack

    Data versioning: Dolt, DVC, Git LFS, Pachyderm, lakeFS, Weights & Biases, Neptune, Comet, Delta Lake

    Data Governance: Collibra, Alation, Informatica

    Data Quality: Trifacta, Talend, Informatica

    Code versioning: GitLab, GitHub, SVN

    Model versioning: Neptune, ModelDB, DVC, MLflow, Pachyderm, Polyaxon

    Pipeline orchestration: Kale, Apache Airflow, Argo Workflows, Luigi, Kubeflow, Kedro, Nextflow, Dagster, Apache Beam, ZenML, Flyte, Prefect, Ray, DVC, Polyaxon, ClearML, MLRun, Pachyderm, Metaflow, Couler, Valohai

    Runtime engine: Ray, Nuclio, Dask, Horovod, Apache Spark

    Data orchestration: Prefect, Kale, MLRun, Dagster, Kedro, Airflow

    Artifact tracking: Kubeflow, MLflow, Weights & Biases, Neptune, Polyaxon, ClearML, MLRun, Pachyderm

    Model registry: ModelDB, MLflow, Determined, Weights & Biases, Neptune, ClearML, MLRun, Vision AI, DINO, Amazon Rekognition

    Model serving: Seldon Core, BentoML, TensorFlow Serving, KServe, FastAPI, TorchServe, Ray, MLflow, ClearML, MLRun, pymlpipe, Kubeflow, Cortex, ForestFlow

    Model monitoring: Evidently, WhyLabs, Grafana, Alibi Detect, ModelDB, ClearML, MLRun, Prometheus, pymlpipe, NannyML, Aporia, eurybia, Arize, Fiddler, Amazon SageMaker Model Monitor, Qualdo, Neptune, Seldon Core, Censius

    Model Performance Tracking: TensorBoard, MLflow, Comet.ml

    Continuous Integration: Jenkins, Travis CI, CircleCI

    Continuous Deployment: Jenkins, Travis CI, CircleCI

    Containerization: Docker, Kubernetes

    Configuration Management: Ansible, Puppet, Chef

    Data validation: Pydantic, eurybia

    Model testing: Deepchecks, Neptune, Mona, Grafana + Prometheus

    Model Security: Seldon, OpenVINO, TensorFlow Privacy

    Continuous Integration and Continuous Deployment (CI/CD) tools for machine learning: CML, GitHub Actions, GitLab CI/CD, Jenkins, TeamCity, CircleCI, Travis CI

    aim https://github.com/aimhubio/aim

    Metaflow,MLReef,MLRun,ZenML,MLflow,Seldon,Bodywork,Pachyderm,DVC, or Data Version Control

    MLOps https://analyticsindiamag.com/8-projects-to-kickstart-your-mlops-journey-in-2021/

    Open MLOps https://github.com/datarevenue-berlin/OpenMLOps

    Best Tools for Tracking Machine Learning Experiments https://neptune.ai/blog/best-ml-experiment-tracking-tools

    mlops-https://github.com/visenger/awesome-mlops

    mlflow https://towardsdatascience.com/get-started-with-mlops-fd7062cab018

    GuildAI https://guild.ai/ https://github.com/guildai/guildai

    MLOPS https://www.analyticsinsight.net/top-mlops-based-tools-for-enabling-effective-machine-learning-lifecycle/ https://neptune.ai/blog/best-mlops-tools

    ML-Model-CI https://github.com/cap-ntu/ML-Model-CI

    Easy MLOps with PyCaret + MLflow https://www.kdnuggets.com/2021/05/easy-mlops-pycaret-mlflow.html

    https://www.kdnuggets.com/2021/03/overview-mlops.html https://medium.com/prosus-ai-tech-blog/towards-mlops-technical-capabilities-of-a-machine-learning-platform-61f504e3e281

    omegaml https://github.com/omegaml/omegaml

    https://neptune.ai/blog/8-best-data-science-and-machine-learning-platforms-for-mlops?utm_source=twitter&utm_medium=tweet&utm_campaign=blog-8-best-data-science-and-machine-learning-platforms-for-mlops

    https://neptune.ai/blog/ml-model-monitoring-best-tools?utm_source=email&utm_medium=newsletter&utm_campaign=blog-march&utm_content=ml-model-monitoring-best-tools

    https://neptune.ai/blog/end-to-end-mlops-platforms?utm_source=email&utm_medium=newsletter&utm_campaign=blog-march&utm_content=end-to-end-mlops-platforms

    https://neptune.ai/blog/mlops-at-greensteam-shipping-machine-learning-case-study?utm_source=email&utm_medium=newsletter&utm_campaign=blog-march&utm_content=mlops-at-greensteam-shipping-machine-learning-case-study

    https://neptune.ai/blog/mlops-10-best-practices?utm_source=email&utm_medium=newsletter&utm_campaign=blog-march&utm_content=mlops-10-best-practices

    https://neptune.ai/blog/machine-learning-model-management?utm_source=email&utm_medium=newsletter&utm_campaign=blog-march&utm_content=machine-learning-model-management

    https://mlops.githubapp.com/ https://about.mlreef.com/blog/global-mlops-and-ml-tools-landscape https://github.com/paiml/practical-mlops-book

    https://olympus.greatlearning.in/courses/12956?_gl=1*ljadx1*_ga*NjMxNjUxNjM2LjE2MDYyMDYzNDM.*_ga_TH52C020P8*MTYxMTIyOTQ0MS40Ny4wLjE2MTEyMjk0NDEuNjA.

    https://docs.microsoft.com/en-us/azure/architecture/example-scenario/mlops/mlops-technical-paper https://neptune.ai/blog/end-to-end-mlops-platforms

    https://github.com/kelvins/awesome-mlops#hyperparameter-tuning

    ClearML https://analyticsindiamag.com/guide-to-clearml-zero-integration-mlops-solution/

    https://neptune.ai/blog/mlops-what-it-is-why-it-matters-and-how-to-implement-it-from-a-data-scientist-perspective?utm_source=twitter&utm_medium=tweet&utm_campaign=blog-mlops-what-it-is-why-it-matters-and-how-to-implement-it-from-a-data-scientist-perspective

    https://ml-ops.org/content/mlops-principles

    Monitoring: Evidently https://evidentlyai.com/ , Seldon Alibi https://github.com/SeldonIO/alibi-detect

    115.Code faster https://www.tabnine.com/

    117.https://www.pye.ai/2021/03/19/machine-learning-model-management-what-why-and-how/ https://www.ambiata.com/blog/2020-12-07-mlops-tools/

    Pachyderm, Kubeflow, MLflow, Metaflow, ZenML, Seldon, Bodywork, MLReef, MLRun, DVC, katana-skipper, Weights & Biases, Valohai, Polyaxon, Neptune.ai, CometML, Algorithmia, ClearML, Airflow, Kedro, GitHub Actions, Flyte, Iguazio, DataRobot, Dataiku, cnvrg.io, AWS SageMaker, Evidently

    BentoML Unified Model Serving Framework https://github.com/bentoml/BentoML

    mlflow https://mlflow.org/docs/latest/index.html https://github.com/amesar/mlflow-examples

    MLFlow by pycaret https://pycaret.org/mlflow/?utm_medium=social&utm_source=linkedin&utm_campaign=postfity&utm_content=postfity2c1c2

    labml https://ramith.fyi/tracking-your-ml-experiments-without-sending-data-to-the-cloud/

    MLOps https://github.com/microsoft/MLOps https://mlops.githubapp.com/ https://huyenchip.com/2020/12/30/mlops-v2.html https://github.com/paiml/practical-mlops-book https://analyticsindiamag.com/top-10-tools-to-kickstart-your-mlops-journey-in-2021

    MLOps platforms: Amazon SageMaker, Domino Data Lab, H2O MLOps, Cloudera Data Platform, Kubeflow, MLflow, Metaflow, Flyte, ZenML, MLRun, Algorithmia, Dataiku, DataRobot, Pachyderm, Databricks Lakehouse, Neptune.ai

    7 Best Resources To Learn MLOps In 2021 https://analyticsindiamag.com/7-best-resources-to-learn-mlops-in-2021/

    DevOps https://github.com/collections/devops-tools

    airflow https://github.com/apache/airflow

    kubeflow https://github.com/kubeflow/kubeflow

    kubernetes https://github.com/kubernetes/kubernetes

    Metaflow https://metaflow.org/ https://github.com/Netflix/metaflow

    pipeline https://scikit-learn.org/stable/modules/generated/sklearn.pipeline.Pipeline.html

    Tensorflow Extended https://www.tensorflow.org/tfx Tensorflow Transform https://www.tensorflow.org/tfx/transform/get_started

    https://aniruddha-choudhury49.medium.com/mlops-kubeflow-with-tensorflow-tfx-pipelines-seamlessly-and-at-scale-92b432bd39b0

    Serving Models https://www.tensorflow.org/tfx/guide/serving

    Tensorflow Data Validation https://www.tensorflow.org/tfx/data_validation/get_started TensorFlow Model Analysis https://www.tensorflow.org/tfx/model_analysis/get_started

    Model Validation Toolkit https://finraos.github.io/model-validation-toolkit/ https://github.com/FINRAOS/model-validation-toolkit

    MLflow Open-source platform for tracking machine learning experiments https://mlflow.org/ https://analyticsindiamag.com/guide-to-mlflow-a-platform-to-manage-machine-learning-lifecycle/ https://www.kdnuggets.com/2021/01/model-experiments-tracking-registration-mlflow-databricks.html

    ray https://docs.ray.io/en/master/serve/ https://github.com/ray-project/ray

    https://medium.com/distributed-computing-with-ray/ray-mlflow-taking-distributed-machine-learning-applications-to-production-103f5505cb88

  1. Feature Stores https://neptune.ai/blog/feature-stores-components-of-a-data-science-factory-guide?utm_source=twitter&utm_medium=tweet&utm_campaign=blog-feature-stores-components-of-a-data-science-factory-guide

    Top 10 Leading Machine Learning Feature Stores https://www.pye.ai/2021/05/14/top-10-machine-learning-feature-store-systems/

    118.algorithm to use by problem https://www.datasciencecentral.com/profiles/blogs/which-machine-learning-deep-learning-algorithm-to-use-by-problem

    119.Connect the world to your data and fuel your ML.

    OpenBlender Enrich ML Models with adding new Variables from Any Source to Boost Performance https://www.youtube.com/channel/UCCFN8DDrA6k7eHYLvZGdNVA https://openblender.io/

    120.Google’s MuRIL (Multilingual Representations for Indian Languages) https://tfhub.dev/google/MuRIL/1

    121.mxnet https://mxnet.apache.org/versions/master/api/python/docs/tutorials/getting-started/crash-course/index.html

    122.tools-https://towardsdatascience.com/data-science-tools-f16ecd91c95d

    123.Elements of AI free online course https://www.elementsofai.com/

    124.Best_AI_paper_2020 https://github.com/louisfb01/Best_AI_paper_2020

    125.roadmap https://github.com/graykode/nlp-roadmap https://www.theinsaneapp.com/2021/03/roadmap-series.html

    https://www.freecodecamp.org/news/data-science-learning-roadmap/ https://www.kdnuggets.com/2020/12/roadmaps-ai-developer-data-scientist-machine-learning-engineer.html

    https://mohammedazeem665.medium.com/plan-to-learn-machine-learning-data-science-in-2021-note-these-assets-from-2020-e84389d94097

    https://github.com/AMAI-GmbH/AI-Expert-Roadmap

    https://becominghuman.ai/cheat-sheets-for-ai-neural-networks-machine-learning-deep-learning-big-data-678c51b4b463

    data-engineer-roadmap https://github.com/datastacktv/data-engineer-roadmap

    126.https://neptune.ai/blog/best-data-science-tools-to-increase-machine-learning-model-understanding?utm_source=twitter&utm_medium=tweet&utm_campaign=blog-best-data-science-tools-to-increase-machine-learning-model-understanding

    Visualizing the Execution of Python Program http://pythontutor.com/ https://www.youtube.com/watch?v=pCSlWQjfCzA

    MLPerf Model performance debugging tools https://mlperf.org/

    Model debugging tools Manifold https://eng.uber.com/manifold/

    Pytest for Data Scientists https://towardsdatascience.com/4-lessor-known-yet-awesome-tips-for-pytest-2117d8a62d9c

    Icecream https://towardsdatascience.com/stop-using-print-to-debug-in-python-use-icecream-instead-79e17b963fcc

    Experiment tracking tools WandB https://wandb.ai/site

    Comet manage and organize machine learning experiments https://www.comet.ml/site/ https://analyticsindiamag.com/how-to-supercharge-your-machine-learning-experiments-with-comet-ml/

    neptune https://neptune.ai/ https://analyticsindiamag.com/how-to-manage-ml-experiments-with-neptune-ai/

    weights & biases https://wandb.ai/site https://analyticsindiamag.com/hands-on-guide-to-weights-and-biases-wandb-with-python-implementation/ https://docs.wandb.ai/

    https://www.kdnuggets.com/2020/07/tour-end-to-end-machine-learning-platforms.html

    127.19 Best JupyterLab Extensions for Machine Learning https://neptune.ai/blog/jupyterlab-extensions-for-machine-learning

    128.coreml https://developer.apple.com/machine-learning/core-ml/

    129.Protect Your Neural Networks Against Hacking Adversarial Robustness Toolbox (ART) https://analyticsindiamag.com/adversarial-robustness-toolbox-art/

    130.https://www.kdnuggets.com/2021/01/10-underappreciated-python-packages-machine-learning-practitioners.html

    131.datascience-fails https://github.com/xLaszlo/datascience-fails

    132.Jupyter notebook integration for Microsoft Excel https://github.com/pyxll/pyxll-jupyter https://towardsdatascience.com/python-jupyter-notebooks-in-excel-5ab34fc6439

    Voilà turns Jupyter notebooks into standalone web applications https://github.com/voila-dashboards/voila https://github.com/voila-dashboards/voila-gridstack

    How to Optimize Your Jupyter Notebook https://www.kdnuggets.com/2020/01/optimize-jupyter-notebook.html

    133.rapidly develop data applications with Python https://github.com/dstackai/dstack

    134.Google Research: Looking Back at 2020, and Forward to 2021 https://ai.googleblog.com/2021/01/google-research-looking-back-at-2020.html

    135.cortex Run inference at scale https://www.cortex.dev/ https://github.com/cortexlabs/cortex

    136.https://www.theinsaneapp.com/2020/12/machine-learning-and-data-science-cheat-sheets-pdf.html

    137.Federated Learning Systems

    Flower – A Framework To Build Federated Learning Systems https://github.com/adap/flower https://flower.dev/
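
    The core idea behind many federated learning systems is that clients train locally and a server combines their model weights without seeing the raw data. The sketch below shows the weighted-averaging step (commonly called federated averaging) in plain Python. It is not Flower's API; the function name, the two clients, and their weights are hypothetical.

```python
def federated_average(client_weights, client_sizes):
    """Average client weight vectors, weighting each client by its sample count."""
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Two hypothetical clients, each holding a two-parameter model.
client_weights = [[1.0, 2.0], [3.0, 4.0]]
client_sizes = [1, 3]  # the second client trained on three times as much data

global_weights = federated_average(client_weights, client_sizes)
print(global_weights)  # [2.5, 3.5]
```

    In a real system this step repeats every round: the server sends the averaged weights back to the clients, they train further on local data, and the cycle continues.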

    138.https://analyticsindiamag.com/top-ai-powered-writing-assistants-to-create-better-content/

    139.Tensorflow Data Validation - Data Analysis At Scale https://www.youtube.com/watch?v=eGIG_qHgQ08

    140.SciKeras https://scikeras.readthedocs.io/en/latest/#

    141.debugging Data viewer https://devblogs.microsoft.com/python/python-in-visual-studio-code-january-2021-release/

    142.Machine Learning Lifecycle in 2021 https://towardsdatascience.com/the-machine-learning-lifecycle-in-2021-473717c633bc

    143.Introduction To ML.NET – An ML Framework For DOTNET Developers https://analyticsindiamag.com/introduction-to-ml-net-a-machine-learning-framework-for-dotnet-developers/

    https://analyticsindiamag.com/step-by-step-guide-for-image-classification-using-ml-net/

    144.https://www.perceptilabs.com/home http://deeplearninggallery.com/ https://www.kdnuggets.com/2019/01/practical-apache-spark-10-minutes.html

    145.https://www.kdnuggets.com/2018/09/machine-learning-cheat-sheets.html https://www.kdnuggets.com/2018/09/meverick-lin-data-science-cheat-sheet.html

    https://www.kdnuggets.com/2018/08/data-visualization-cheatsheet.html https://www.kdnuggets.com/2018/07/sql-cheat-sheet.html https://www.kdnuggets.com/2018/04/python-regular-expressions-cheat-sheet.html https://www.kdnuggets.com/2017/09/essential-data-science-machine-learning-deep-learning-cheat-sheets.html

    https://www.analyticsvidhya.com/blog/2021/01/5-python-packages-every-data-scientist-must-know/

    https://www.kdnuggets.com/2021/01/ultimate-scikit-learn-machine-learning-cheatsheet.html https://www.kdnuggets.com/2020/09/10-things-know-scikit-learn.html

    146.Data Pipelines https://www.kdnuggets.com/2018/05/beginners-guide-data-science-pipeline.html https://www.kdnuggets.com/2019/03/data-pipelines-luigi-airflow-everything-need-know.html

  3. AI Habitat: A Platform For Embodied AI Research https://analyticsindiamag.com/hands-on-guide-to-ai-habitat-a-platform-for-embodied-ai-research/

    149.onnx https://medium.com/towards-artificial-intelligence/onnx-for-model-interoperability-faster-inference-8709375db9bf

    152.Best ML Frameworks & Extensions for Scikit-learn https://neptune.ai/blog/the-best-ml-framework-extensions-for-scikit-learn?utm_source=twitter&utm_medium=tweet&utm_campaign=blog-the-best-ml-framework-extensions-for-scikit-learn

    153.Multimodal Neurons, The Most Advanced Neural Networks Discovered By OpenAI https://analyticsindiamag.com/inside-multimodal-neurons-the-most-advanced-neural-networks-discovered-by-openai/

    154.TensorGram https://github.com/ksdkamesh99/TensorGram https://www.youtube.com/watch?v=ItDBQB4YFuI

    knockknock https://towardsdatascience.com/how-to-get-notified-when-your-model-is-done-training-with-knockknock-483a0475f82c

    labml Organize machine learning experiments and monitor training progress from mobile https://labml.ai/

    WeightWatcher https://github.com/CalculatedContent/WeightWatcher

    labml Monitor deep learning model training and hardware usage from your mobile phone https://labml.ai/ https://github.com/labmlai/labml

    ml notify https://github.com/aporia-ai/mlnotify

    155.r packages https://upurl.me/vkf3r http://r-bloggers.com/2021/04/15-essential-packages-in-r-for-data-science/ https://www.ubuntupit.com/best-r-machine-learning-packages/

    Top 10 Free Resources To Learn R https://analyticsindiamag.com/top-10-free-resources-to-learn-r/

    https://bluemind1988.medium.com/explore-r-libraries-for-end-to-end-data-science-projects-b4d0af3a9f5c

    analyticsvidhya.com/blog/2021/04/top-10-r-packages-for-data-science-you-must-know-in-2021/

    156.Top Julia Libraries for Machine Learning https://www.analyticsvidhya.com/blog/2021/05/top-julia-machine-learning-libraries/

    156.openblender Fuel your ML Engines with Relevant Data to Boost Performance https://openblender.io/#/welcome

    157.all Domain-based A.I. Platform for Data Scientists https://www.cluzters.ai/

    158.2D images to 3D https://analyticsindiamag.com/python-guide-to-neural-body-converting-2d-images-to-3d/

    Open3D: An Open Source Modern Library For 3D Data Processing https://github.com/intel-isl/Open3D

    160.https://gallery.allennlp.org/ https://prior.allenai.org/projects/gpv

    161.NVIDIA Unveils 50+ New, Updated AI Tools and Trainings for Developers https://www.hpcwire.com/off-the-wire/nvidia-unveils-50-new-updated-ai-tools-and-trainings-for-developers/

    164.notes Data Science & Machine Learning https://chrisalbon.com/

    165.black uncompromising Python code formatter https://github.com/psf/black

    166.Feature stores https://www.kdnuggets.com/2021/05/feature-stores-how-avoid-feeling-every-day-is-groundhog-day.html https://neptune.ai/blog/feature-stores-components-of-a-data-science-factory-guide?utm_source=twitter&utm_medium=tweet&utm_campaign=blog-feature-stores-components-of-a-data-science-factory-guide

    167.Code and Notebook Versioning for ML Teams https://neptune.ai/blog/code-and-notebook-versioning-for-ml-teams-guide?utm_source=twitter&utm_medium=tweet&utm_campaign=blog-code-and-notebook-versioning-for-ml-teams-guide

    10 tools that can serve as a great alternative to the different parts of ClearML https://neptune.ai/blog/clear-ml-alternatives?utm_source=twitter&utm_medium=tweet&utm_campaign=blog-clear-ml-alternatives

    168.3 Tools to Track and Visualize the Execution of your Python Code https://towardsdatascience.com/3-tools-to-track-and-visualize-the-execution-of-your-python-code-666a153e435e

Follow leaders in the field to stay up to date

  1. 1.LinkedIn
  2. 2.Twitter

CPU/GPU/TPU

  1. 1.Google Colab (FREE). JupyterLab for Python, R, and Swift on Google Colab with ColabCode https://www.youtube.com/watch?v=Q35WIqZoUF4
  2. https://www.analyticsvidhya.com/blog/2021/01/avid-user-of-google-colab-here-are-some-alternatives-of-google-colab-that-you-should-know-about/?utm_source=linkedin&utm_medium=social&utm_campaign=old-blog&utm_content=B&custom=FBI156
  3. https://towardsdatascience.com/use-colab-more-efficiently-with-these-hacks-fc89ef1162d8 https://www.analyticsvidhya.com/blog/2021/05/10-colab-tips-and-hacks-for-efficient-use-of-it/
  4. ColabCode This is an amazing extension to the already available resource, Google Colab https://github.com/abhi1thakur/colabcode
  5. GitHub notebooks with Google Colab https://www.youtube.com/watch?v=LmIylxNmA-A&feature=youtu.be
  6. colab_everything Python library to run streamlit, flask, fastapi, etc on google colab https://github.com/Ankur-singh/colab_everything/
  7. 2.Kaggle kernel(read terms and conditions before use) (FREE)
  8. 3.Paperspace Gradient(read terms and conditions before use)
  9. 4.knime - https://www.knime.com/(read terms and conditions before use)
  10. 5.RapidMiner (read terms and conditions before use)
  11. https://github.com/zszazi/Deep-learning-in-cloud
  12. 6.saturncloud https://saturncloud.io/
  13. Intel Jupyter Lab, Amazon SageMaker, Binder, DeepNote, Hex, DataBricks Notebook, JetBrains Datalore, DataCamp Workspace, Notablejournal, Notable, Observable, CoCalc, Replit, IBM DataPlatform Notebooks, CodeSandbox, StackBlitz

So what next ?

Participate in online competitions, build projects, apply for internships and jobs, solve real-world problems, etc.

Applications of data science in many industries

  1. 1.E-commerce- Identifying consumers,Recommending Products,Analyzing Reviews
  2. 2.Manufacturing- Predicting potential problems,Monitoring systems,Automating manufacturing units, Maintenance Scheduling,Anomaly Detection
  3. 3.Banking- Fraud detection,Credit risk modeling,Customer lifetime value
  4. 4.Healthcare- Medical image analysis, Drug discovery,Bioinformatics,Virtual Assistants,image segmentation
  5. 5.Transport- Self-driving cars,Enhanced driving experience,Car monitoring system,Enhancing the safety of passengers
  6. 6.Finance- Customer segmentation,Strategic decision making,Algorithmic trading,Risk analytics
  7. 7.Marketing (Added from comments Credits: Jawad Ali)- LTV predictions,Predictive analytics for customer behavior,Ad targeting
  8. and many more fields - https://www.topbots.com/enterprise-ai-companies-2020/ , https://venturebeat.com/2020/10/21/the-2020-data-and-ai-landscape/

Research blogs https://www.theinsaneapp.com/2021/04/top-machine-learning-blogs-to-follow-in-2021.html

Explainpaper https://www.explainpaper.com/

https://reconshell.com/top-ai-and-machine-learning-blogs-curated-for-ai-enthusiasts/

1.https://ai.facebook.com/ https://ai.facebook.com/blog/

2.https://ai.googleblog.com/

3.https://deepmind.com/blog https://deepai.org/definitions

4.https://openai.com/blog/

5.https://www.malongtech.com/en/research.html

6.https://blogs.nvidia.com/blog/tag/artificial-intelligence/ https://blogs.nvidia.com/

https://ai.googleblog.com/2021/01/google-research-looking-back-at-2020.html?m=1

7.https://blog.tensorflow.org/

8.https://pytorch.org/blog/

9.https://distill.pub/

kdnuggets.com

https://www.kdnuggets.com/2020/01/top-10-ai-ml-articles-to-know.html

RESEARCH LABS IN THE WORLD

https://ai.facebook.com/ https://ai.googleblog.com/ https://research.google/ https://ai.google/research/

1.The Alan Turing Institute:https://www.turing.ac.uk/

2.J.P. Morgan AI Research Lab:https://www.jpmorgan.com/insights/tec

3.Oxford ML Research Group:http://www.robots.ox.ac.uk/~parg/proj

4.Microsoft Research Lab- AI:https://www.microsoft.com/en-us/resea

5.Berkeley AI Research:https://bair.berkeley.edu/

6.LIVIA:https://en.etsmtl.ca/Unites-de-recher

7.MIT Computer Science and Artificial Intelligence Laboratory (CSAIL):https://www.csail.mit.edu/

online competitions:

Top 25 Machine Learning Hackathons https://medium.com/analytics-vidhya/top-25-machine-learning-hackathons-its-here-now-for-anyone-to-move-to-data-science-a93deb2a198a

1.Kaggle-https://www.kaggle.com/

kaggle-solutions https://github.com/faridrashidi/kaggle-solutions

2.hackerearth-https://www.hackerearth.com/challenges/

3.machinehack-https://www.machinehack.com/

4.analyticsvidhya-https://datahack.analyticsvidhya.com/contest/all/

5.zindi-https://zindi.africa/competitions

6.crowdai-https://www.crowdai.org/

7.driven data-https://www.drivendata.org/

8.dockship-https://dockship.io/Runway AI

9.SIGNATE Competition- https://signate.jp/about?rf=competition_about

9.International Data Analysis Olympiad (IDAO)

10.Codalab

11.Iron Viz

12.Data Science Challenges

13.Tianchi Big Data Competition

14.https://www.techgig.com/hackathon/ml_hackathon

15.https://www.openml.org/

https://towardsdatascience.com/12-data-science-ai-competitions-to-advance-your-skills-in-2021-32e3fcb95d8c https://www.kdnuggets.com/2020/09/international-alternatives-kaggle-data-science-competitions.html

Some useful content :

  1. H2O.ai AutoML, Google Cloud AutoML, Google ML Kit (https://developers.google.com/ml-kit), Azure Cognitive Services, Azure Machine Learning Service, Amazon ML, Azure Machine Learning Studio, Google Cloud Platform, GCP AutoML Vision, Weka, AutoWeka, Microsoft Cognitive Toolkit, DataRobot AutoML, Databricks AutoML, Azure ML, IBM Watson ML Studio, AWS SageMaker Studio, AWS Rekognition, Google AI Platform, Databricks, Domino Data Lab, Roboflow, Qlik AutoML, NVIDIA TAO

H2O Driverless AI https://www.h2o.ai/products/h2o-driverless-ai/

H2O Flow - Web Based Machine Learning Development https://docs.h2o.ai/h2o/latest-stable/h2o-docs/flow.html https://www.analyticsvidhya.com/blog/2021/05/a-step-by-step-guide-to-automl-with-h2o-flow/

https://docs.microsoft.com/en-us/azure/machine-learning/algorithm-cheat-sheet

https://neptune.ai/blog/best-machine-learning-as-a-service-platforms-mlaas?utm_source=twitter&utm_medium=tweet&utm_campaign=blog-best-machine-learning-as-a-service-platforms-mlaas

https://codegnan.com/blog/35-best-data-sciecne-tools-for-beginners-to-master/ https://analyticsindiamag.com/free-online-resources-to-learn-automl/

https://analyticsindiamag.com/10-popular-automl-tools-developers-can-use/ https://analyticsindiamag.com/8-best-open-source-tools-for-data-mining/

mlkit-https://firebase.google.com/products/ml runway https://runwayml.com/ fritz https://www.fritz.ai/

obviously https://www.obviously.ai/ createml https://developer.apple.com/machine-learning/create-ml/ makeml https://makeml.app/

superannotate https://superannotate.com/ https://rapidminer.com/ https://monkeylearn.com/monkeylearn-studio/ https://nanonets.com/

GCP Professional ML Engineer certification in 8 days https://ml-rafiqhasan.medium.com/how-i-cracked-the-gcp-professional-ml-engineer-certification-in-8-days-f341cf0bc5a0

Vertex AI, one platform, every ML tool you need https://cloud.google.com/vertex-ai

2.FasterAI,keras,fastai,tensorflow,pytorch

Automated model architecture search tools (e.g. darts, enas) https://awesomeopensource.com/projects/automl

https://github.com/search?q=automl https://www.kdnuggets.com/2016/03/automated-data-science.html https://www.kdnuggets.com/software/automated-data-science.html

Tpot https://github.com/EpistasisLab/tpot

ATOM https://github.com/tvdboom/ATOM https://towardsdatascience.com/how-to-test-multiple-machine-learning-pipelines-with-just-a-few-lines-of-python-1a16cb4686d

mljar-supervised https://github.com/mljar/mljar-supervised

libra end-to-end machine learning process in just one line of code https://github.com/Palashio/libra

featurewiz, boruta_py ,AutoWeka,Auto-Sklearn,AutoGluon,Auto-PyTorch,AutoKeras,auto-tensorflow,Ludwig,MLBox,PyCaret,LightAutoML,FLAML,EvalML,H2O AutoML

GML https://github.com/Muhammad4hmed/GML

auto_ml https://github.com/ClimbsRocks/auto_ml

automl-gs Automating Machine Learning In A Single Line Of Code https://github.com/minimaxir/automl-gs

paddlehub Performing Computer Vision & NLP Tasks in a Single Line of Code https://github.com/PaddlePaddle/PaddleHub

pywedge https://github.com/taknev83/pywedge https://towardsdatascience.com/automated-interactive-package-for-eda-modeling-and-hyperparameter-tuning-in-a-few-lines-of-228c561fa63c

LightAutoML https://github.com/sberbank-ai-lab/LightAutoML https://lightautoml.readthedocs.io/en/latest/ https://towardsdatascience.com/lightautoml-preset-usage-tutorial-2cce7da6f936

FLAML fast and lightweight AutoML library https://github.com/microsoft/FLAML

LightAutoML LAMA - automatic model creation framework https://github.com/sberbank-ai-lab/LightAutoML

H2O Hydrogen Torch: A No-code Deep Learning Framework

EvalML is an AutoML library https://github.com/alteryx/evalml https://evalml.alteryx.com/en/stable/ https://www.kdnuggets.com/2021/04/easy-automl-python.html https://www.youtube.com/watch?v=uuYEQqrExBQ https://www.analyticsvidhya.com/blog/2021/05/machine-learning-automation-using-evalml-library/

dataprep Beginners Guide to Automation in Data Science https://www.analyticsvidhya.com/blog/2021/04/beginners-guide-to-automation-in-data-science/

A machine learning tool for automated prediction engineering https://github.com/alteryx/compose

adanet https://github.com/tensorflow/adanet

mljar-supervised https://github.com/mljar/mljar-supervised/ https://www.kdnuggets.com/2021/05/binary-classification-automated-machine-learning.html

ludwig https://github.com/ludwig-ai/ludwig

carefree-learn is a minimal Automatic Machine Learning (AutoML) solution for tabular datasets based on PyTorch https://carefree0910.me/carefree-learn-doc/

autoweka https://github.com/automl/autoweka

ATOM Automated Tool for Optimized Modelling https://github.com/tvdboom/ATOM

autokeras https://autokeras.com/ autoSklearn https://automl.github.io/auto-sklearn/master/

baytune auto-tuning https://github.com/MLBazaar/BTB

storm-tuner Best Hyper Parameters For Deep Learning Model https://github.com/ben-arnao/StoRM

AlphaPy Automated Machine Learning https://github.com/ScottfreeLLC/AlphaPy

TransmogrifAI https://github.com/salesforce/TransmogrifAI

Hugging Face’s AutoNLP https://www.analyticsvidhya.com/blog/2021/03/a-hands-on-introduction-to-hugging-faces-autonlp-101/

complex Machine Learning model in one line with Libra https://github.com/Palashio/libra

Automated Text Classification with EvalML https://www.kdnuggets.com/2021/04/automated-text-classification-evalml.html

Pywedge A complete package for EDA, Data Preprocessing and Modelling https://towardsdatascience.com/pywedge-a-complete-package-for-eda-data-preprocessing-and-modelling-32171702a1e0

3.awesome-AutoML https://github.com/windmaple/awesome-AutoML , automl-gs github.com/minimaxir/automl-gs

autopandas, Auto-Sklearn, Auto-PyTorch, Auto-ViML, AutoViz, AutoGluon, MLBox, FLAML, EvalML, scikit-optimize, Hyperopt-Sklearn, SMAC3, AlphaPy, NNI, AdaNet, Ludwig, TPOT, H2O AutoML, automl, LightAutoML, AutoKeras, MLJAR, PyCaret, WALTS

Keras Tuner, DataRobot, H2O Driverless AI, autoweka, Amazon Lex, Darwin, GradsFlow, AutoAI, Get Duet, Qlik AutoML, Neuton AutoML, Clarifai, CreateML, Lobe, ObviouslyAI, RunwayML, TransmogrifAI, RapidMiner, Dataiku, BigML, AutoML JADBio, Akkio, Tazi.ai, ANAI, Google Vizier, Tune, HpBandSter, Hyperopt, Facebook’s HiPlot, Bayesian Optimisation, SmartML, SigOpt, Talos, mlmachine, SHERPA, GPyOpt, Metric Optimization Engine (MOE), Optuna, Ray Tune

Automated Tensorflow https://github.com/rafiqhasan/auto-tensorflow

MLBox https://github.com/AxeldeRomblay/MLBox

skycube automl https://skycube.app/

stackml Machine Learning platform in the browser https://stackml.com/

quick_ml https://pypi.org/project/quick-ml/ https://www.quickml.info/

MLJAR https://github.com/mljar/mljar-supervised/ https://towardsdatascience.com/binary-classification-with-automated-machine-learning-1a36e78ba50f

TransmogrifAI https://github.com/salesforce/TransmogrifAI darwin http://drwn.anu.edu.au/

GenoML (AutoML) for Genomics https://genoml.com/ https://github.com/GenoML

baytune https://www.kdnuggets.com/2021/03/automating-machine-learning-model-optimization.html https://github.com/MLBazaar/BTB

FEDOT Automated modeling and machine learning framework FEDOT https://github.com/nccr-itmo/FEDOT

4.AutoGluon AutoML for Text, Image, and Tabular Data https://analyticsindiamag.com/how-to-automate-machine-learning-tasks-using-autogluon/

AutoGL: The First Ever AutoML Framework for Graph Datasets https://analyticsindiamag.com/meet-autogl-the-first-ever-automl-framework-for-graph-datasets/

Neuton TinyML https://neuton.ai/

  1. auto sklearn,auto keras,auto Tensorflow,autoLightAutoML,xcessiv,kerastuner ,LAMA, NNI, FEDOT (https://github.com/sberbank-ai-lab/LightAutoML)

deephyper Automating Deep Neural Networks https://github.com/deephyper/deephyper

Keras Tuner or storm-tuner - Decide Number of Hidden Layers And Neuron In Neural Network

AutoNeuro https://autoneuro.challenge-ineuron.in/

ATOM https://towardsdatascience.com/atom-a-python-package-for-fast-exploration-of-machine-learning-pipelines-653956a16e7b https://github.com/tvdboom/ATOM

  1. autoviml https://github.com/AutoViML/Auto_ViML https://towardsdatascience.com/autoviml-automating-machine-learning-4792fee6ae1e

    deep_autoviml https://github.com/AutoViML/deep_autoviml

    GML (automate most of the data science process) https://github.com/Muhammad4hmed/GML

    CodeLess https://pypi.org/project/codeless/ https://github.com/porky5191/codeless_demo_project

    AUTORL: AUTOML FOR RL https://www.automl.org/blog-autorl/

  2. sweetviz (EDA purpose) - https://pypi.org/project/sweetviz/ https://www.kdnuggets.com/2021/03/know-your-data-much-faster-sweetviz-python-library.html

  3. pandasprofiling(display whole EDA) - https://pypi.org/project/pandas-profiling/ https://pandas-profiling.github.io/pandas-profiling/docs/master/rtd/index.html

  4. autokeras,AutoSklearn,Neural Network Intelligence

    FeatureTools automated feature engineering.

    MLBox,Lightwood,mindsdb(machine learning models using SQL queries),mljar-supervised,Ludwig(deep learning models without the need to write code)

    AdaNet is a lightweight TensorFlow-based framework

  5. pycaret- https://pycaret.org/ https://www.kdnuggets.com/2020/08/build-automl-pycaret.html https://www.kdnuggets.com/2020/08/github-best-automl-ever-need.html https://www.kdnuggets.com/2020/07/5-things-pycaret.html

Machine Learning in Power BI using PyCaret https://www.kdnuggets.com/2020/05/machine-learning-power-bi-pycaret.html

https://towardsdatascience.com/build-your-first-anomaly-detector-in-power-bi-using-pycaret-2b41b363244e

https://www.kdnuggets.com/2020/06/deploy-machine-learning-pipeline-cloud-docker.html https://www.kdnuggets.com/2020/08/github-best-automl-ever-need.html

mindsdb Machine Learning in 5 Lines of Code https://mindsdb.com/

automated feature engineering https://github.com/alteryx/featuretools https://towardsdatascience.com/why-automated-feature-engineering-will-change-the-way-you-do-machine-learning-5c15bf188b96

Featuretools https://www.featuretools.com/

Automate your ML Pipelines with EvalML https://analyticsindiamag.com/automate-your-ml-pipelines-with-evalml/

Aethos — A Data Science Library to Automate your Workflow https://towardsdatascience.com/aethos-a-data-science-library-to-automate-workflow-17cd76b073a4

AutoAI — Automating the AI Workflow to Build & Deploy Machine Learning model https://medium.com/geekculture/autoai-automating-the-ai-workflow-to-build-deploy-machine-learning-model-bb2b727cda28

AutoML toolkit https://github.com/microsoft/nni

LightAutoML LAMA - automatic model creation framework https://github.com/sberbank-ai-lab/LightAutoML https://analyticsindiamag.com/hands-on-python-guide-to-lama-an-automatic-ml-model-creation-framework/

LightAutoML https://github.com/sb-ai-lab/LightAutoML

mljar-supervised Automates Machine Learning Pipeline with Feature Engineering and Hyper-Parameters Tuning https://github.com/mljar/mljar-supervised

MLBox is a powerful Automated Machine Learning python library https://github.com/AxeldeRomblay/MLBox

12.Auto_Timeseries by auto_ts

13.AutoNLP_Sentiment_Analysis by autoviml

14.automl lazypredict https://github.com/shankarpandala/lazypredict

AutoML Toolkit for Graph Datasets & Tasks AutoGL(Auto Graph Learning)https://medium.com/syncedreview/tsinghua-university-releases-first-automl-toolkit-for-graph-datasets-tasks-c61ea0261d78

AutoFeat-https://analyticsindiamag.com/guide-to-automatic-feature-engineering-using-autofeat/

15.https://github.com/mstaniak/autoEDA-resources

mito , dtale

bamboolib or pandas-ui or pandas-summary or pandas_visual_analysis or Dtale(get code also) (python package for easy data exploration & transformation)

Automating EDA using Pandas Profiling, streamlit_pandas_profiling, Sweetviz, AutoViz, DataPrep, Vaex, Datapane, pandas_ui, PandasGUI, datatable, Dora, Pywedge, D-Tale, Lux, dabl, Pretty pandas, data_describe, Sparkora, AWS Glue DataBrew, speedML, edaviz, Altair, Voyager, Mito, Facets, KNIME, Pandas-visual-analysis, ExploriPy, HoloViews, atoti, QuickDA, panel-highcharts, Know Your Data, autoplotter, TensorFlow Data Validation, skimpy, Skim, OpenRefine, Visualizer, autoclean, Datatile, Bamboolib, pandas-summary, ipywidgets, ipympl, lens, DStack, klib, Datasette, Auto Data Exploration and Feature Recommendation Tool, Great Expectations, DataProfiler, streamlit-aggrid, Quick-EDA, Deepnote, PiML, Pivottablejs, Qgrid, Explainerdashboard, BitRook, OmniXAI, tabloo, sidetable, hvPlot, summarytools, fasteda, Rath, Missingno, Sketch, pygwalker, Apache Superset, Algorithm-visualizer, perspective, jupyter-datatables, dfgui, AutoProfiler

Three R Libraries for Automated EDA dataMaid,DataExplorer,SmartEDA

fiftyone Highly Interactive Dashboards For Visualizing Datasets and Interpret Model https://towardsdatascience.com/highly-interactive-dashboards-for-visualizing-dataset-and-interpret-model-ce6311ea57ca

interpret Dashboards for Interpreting & Comparing Machine Learning Models https://towardsdatascience.com/dashboards-for-interpreting-comparing-machine-learning-models-ffcfb4c05152

QuickDA https://towardsdatascience.com/save-hours-of-work-doing-a-complete-eda-with-a-few-lines-of-code-45de2e60f257

Dataprep https://towardsdatascience.com/dataprep-eda-accelerate-your-eda-eb845a4088bc https://www.analyticsvidhya.com/blog/2021/05/dataprep-library-perform-eda-faster/

explainerdashboard https://towardsdatascience.com/the-quickest-way-to-build-dashboards-for-machine-learning-models-ec769825070d

Facets https://github.com/PAIR-code/facets https://towardsdatascience.com/visualize-your-data-with-facets-d11b085409bc

Datapane makes it simple to build shareable reports from Python https://github.com/datapane/datapane https://towardsdatascience.com/datapanes-new-features-create-a-beautiful-dashboard-in-python-in-a-few-lines-of-code-a3c44523292b https://towardsdatascience.com/introduction-to-datapane-a-python-library-to-build-interactive-reports-4593fd3cb9c8

lux https://medium.com/swlh/automating-exploratory-data-analysis-part-3-d04352b83072 https://pub.towardsai.net/speed-up-eda-with-the-intelligent-lux-37f96542527b

lux Python API for Intelligent Visual Data Discovery https://github.com/lux-org/lux https://analyticsindiamag.com/python-guide-to-lux-an-interactive-visual-discovery/

Automatic EDA https://thecleverprogrammer.com/2021/02/06/automatic-eda-using-python/

Automated Interactive Package for EDA, Modeling, and Hyperparameter Tuning in a few lines of Python Code https://towardsdatascience.com/automated-interactive-package-for-eda-modeling-and-hyperparameter-tuning-in-a-few-lines-of-228c561fa63c

Arena https://github.com/ModelOriented/Arena

https://github.com/mstaniak/autoEDA-resources https://thecleverprogrammer.com/2021/02/06/automatic-eda-using-python/

ExploriPy import EDA-https://analyticsindiamag.com/hands-on-tutorial-on-exploripy-effortless-target-based-eda-tool/

Lens- Statistical Analysis of Data https://analyticsindiamag.com/hands-on-tutorial-on-lens-python-tool-for-swift-statistical-analysis/

Dashboard in Less Than 10 Lines of Code https://towardsdatascience.com/build-dashboards-in-less-than-10-lines-of-code-835e9abeae4b

Plotly Express Interpret data through interactive visualization https://pub.towardsai.net/matplotlib-is-dead-long-life-to-plotly-express-e1671dce0d18

Rich terminal dashboards https://www.willmcgugan.com/blog/tech/post/building-rich-terminal-dashboards/

Explainable AI dashboards https://github.com/oegedijk/explainerdashboard https://www.youtube.com/watch?v=ZgypAMRcmw8

Machine Learning Model Dashboard https://towardsdatascience.com/machine-learning-model-dashboard-4544daa50848

Creating Automated Python Dashboards using Plotly, Datapane, and GitHub Actions https://towardsdatascience.com/creating-automated-python-dashboards-using-plotly-datapane-and-github-actions-ff8aa8b4e3

atoti Python library to quickly build BI analytics dashboards https://docs.atoti.io/latest/tutorial/tutorial.html

interactive dashboards https://medium.com/analytics-vidhya/explainer-dashboard-build-interactive-dashboards-for-machine-learning-models-fda63e0eab9

MitoSheets https://analyticsindiamag.com/guide-to-mitosheets-harnessing-power-of-spreadsheets-in-python/

Datacleaner-https://analyticsindiamag.com/tutorial-on-datacleaner-python-tool-to-speed-up-data-cleaning-process/

Datacleaner: dora; Voilà - turns Jupyter Notebooks quickly into standalone web applications; Plotly Dash - for more advanced, production-level dashboards

featurewiz(Select the best features from your data set fast with a single line of code) - https://github.com/AutoViML/featurewiz

explainerdashboard https://medium.com/analytics-vidhya/explainer-dashboard-build-interactive-dashboards-for-machine-learning-models-fda63e0eab9

interpret Dashboards for Interpreting & Comparing Machine Learning Models https://hmix13.medium.com/dashboards-for-interpreting-comparing-machine-learning-models-ffcfb4c05152

https://www.kdnuggets.com/2019/07/10-simple-hacks-speed-data-analysis-python.html

Panel - web apps

Automating report generation with Jupyter Notebooks https://medium.com/applied-data-science/full-stack-data-scientist-5-automating-report-generation-with-jupyter-notebooks-919e32e88d18

10 Useful Jupyter Notebook Extensions for a Data Scientist https://towardsdatascience.com/10-useful-jupyter-notebook-extensions-for-a-data-scientist-bd4cb472c25e

Datapane ( Build Interactive Reports) https://towardsdatascience.com/introduction-to-datapane-a-python-library-to-build-interactive-reports-4593fd3cb9c8 https://www.kdnuggets.com/news/index.html

pomegranate probabilistic modelling in Python https://github.com/jmschrei/pomegranate https://www.kdnuggets.com/2020/12/fast-intuitive-statistical-modeling-pomegranate.html

16.CuPy (parallel array processing on the GPU) https://pypi.org/project/cupy/

17.Dabl - automates the well-known 80% of data science work: data preprocessing, data cleaning, and feature engineering https://pypi.org/project/dabl/

18.dask (parallel computation) https://docs.dask.org/en/latest/ https://medium.com/rapids-ai/reading-larger-than-memory-csvs-with-rapids-and-dask-e6e27dfa6c0f#cid=av01_so-nvsh_en-us

pandarallel https://towardsdatascience.com/make-pandas-run-blazingly-fast-3dbcd621f75b

Dask Dataframe and SQL https://docs.dask.org/en/latest/dataframe-sql.html

Swiftapply  – Automatically efficient pandas apply operations https://www.kdnuggets.com/2018/04/swiftapply-automatically-efficient-pandas-apply-operations.html

Dask CUDA

Numba https://github.com/numba/numba https://www.youtube.com/watch?v=3O-Pvnrbsu0 https://www.analyticsvidhya.com/blog/2021/04/numba-for-data-science-make-your-py-code-run-1000x-faster/

Arrow https://towardsdatascience.com/how-fast-is-reading-parquet-file-with-arrow-vs-csv-with-pandas-2f8095722e94

Cython,Numba,PyPy,ray,loky,Dask,p_tqdm (aka Pathos + tqdm),modin,connectorx,cudf, cuML

Reducing Pandas memory https://pythonspeed.com/articles/pandas-load-less-data/ https://www.youtube.com/watch?v=HNE0qHJ9A9o

Speed up Scikit-Learn Model Training https://www.kdnuggets.com/2021/02/speed-up-scikit-learn-model-training.html

mpire Python package for easy multiprocessing, but faster than multiprocessing https://github.com/Slimmer-AI/mpire

thundergbm Fast GBDTs and Random Forests on GPUs https://github.com/Xtra-Computing/thundergbm

thundersvm https://github.com/Xtra-Computing/thundersvm

NumPy API on TensorFlow https://www.tensorflow.org/guide/tf_numpy https://www.youtube.com/watch?v=mgY46AEXnG0

Reduce DataFrame size: cast columns to appropriate dtypes and load only the required columns with usecols
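The dtype and usecols advice can be sketched with pandas as below. This is a minimal illustration, not a benchmark: the CSV text, the column names, and the chosen dtypes are all hypothetical stand-ins for a real, much larger file.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a large CSV file.
csv_text = "id,city,price,notes\n1,Paris,10.5,a\n2,Paris,11.0,b\n3,Rome,9.25,c\n"

# Baseline: read every column with pandas' default dtypes.
full = pd.read_csv(io.StringIO(csv_text))

# Read only the needed columns and declare compact dtypes up front.
small = pd.read_csv(
    io.StringIO(csv_text),
    usecols=["id", "city", "price"],
    dtype={"id": "int32", "city": "category", "price": "float32"},
)

# Compare the memory footprint of both frames (values depend on the pandas version).
print(full.memory_usage(deep=True).sum())
print(small.memory_usage(deep=True).sum())
print(small.dtypes)
```

On real data the savings usually come from low-cardinality text columns stored as category and from skipping columns that are never used.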

Better Data Storage : CSV,Parquet,fastparquet,Feather,HDF5,Apache Arrow,Lance

pandas chunksize, Pandas vectorization, NumPy vectorization, multiprocessing, Airflow, Celery, Modin, Vaex, Ray, Dask, PyPolars, Polars, Spark, PySpark, Koalas, Cython, cuML, cuDF, CuPy, Mars, caching, RAPIDS, joblib, snorkel, Apache Arrow, PyArrow, Ponder, Datatable, Fastparquet, dampr, pandarallel, Parallel-Pandas, Numba, bolt, numexpr, IPython parallel, Nim, speedML, ConnectorX, JAX, Pandas-on-Spark, Terality, swifter, partial_fit(), mtalg, Fugue, taichi, scikit-learn-intelex, bottleneck, Cylon, Ibis, Blaze, Odo, Mapply, DuckDB, DataFusion, Dremio, dbt, Daft
https://www.youtube.com/watch?v=eJyjB3cNIB0&feature=youtu.be

Dealing with big data: optimize DataFrames, use only required columns, chunk the data, use sparse data formats, use better data file formats (Parquet, Feather, HDF5), pandas alternatives (Modin, Vaex, Dask, Spark), Intel(R) Extension for Scikit-learn, apply vectorized operations, Numba, RAPIDS cuDF
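Chunking, one of the techniques listed above, can be sketched with the chunksize parameter of pandas read_csv. The data, the column names (store, sales), and the chunk size of 4 are hypothetical and chosen only to keep the example small.

```python
import io

import pandas as pd

# Hypothetical data standing in for a file too large to load at once.
csv_text = "store,sales\n" + "".join(f"s{i % 3},{i}\n" for i in range(10))

totals = {}
# With chunksize set, read_csv yields DataFrames of at most 4 rows each.
for chunk in pd.read_csv(io.StringIO(csv_text), chunksize=4):
    # Aggregate within the chunk, then merge into the running totals.
    partial = chunk.groupby("store")["sales"].sum()
    for store, value in partial.items():
        totals[store] = totals.get(store, 0) + int(value)

print(totals)
```

Only one chunk is held in memory at a time, so this pattern suits aggregations that can be combined from partial results, such as sums and counts.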

composer library of algorithms to speed up neural network training https://github.com/mosaicml/composer

ColossalAI A Unified Deep Learning System for Large-Scale Parallel Training https://github.com/hpcaitech/ColossalAI

19.dataprep (Understand your data with a few lines of code in seconds)

data-preparation-tools - https://improvado.io/blog/data-preparation-tools

20.Dora library is another data analysis library designed to simplify exploratory data analysis. https://pypi.org/project/Dora/

23.FlashText (A library faster than Regular Expressions for NLP tasks) https://pypi.org/project/flashtext/

24.Guietta (tool that makes simple GUIs simple) https://pypi.org/project/guietta/

pandas-visual-analysis -https://analyticsindiamag.com/hands-on-guide-to-pandas-visual-analysis-way-to-speed-up-data-visualization/

25.hummingbird (speeds up execution of traditional machine learning models) https://pypi.org/project/Hummingbird/ https://analyticsindiamag.com/guide-to-hummingbird-a-microsofts-library-for-expediting-traditional-machine-learning-models/

CUML- increase the speed of training your machine learning model https://towardsdatascience.com/train-your-machine-learning-model-150x-faster-with-cuml-69d0768a047a

https://docs.rapids.ai/api/cuml/stable/

modin https://www.kdnuggets.com/2021/03/speed-up-pandas-modin.html

Datatable speed up pandas https://www.youtube.com/watch?v=mQi6QIGGJ5U

Process large datasets without running out of memory https://pythonspeed.com/memory/?utm_medium=email&utm_source=topic+optin&utm_campaign=awareness&utm_content=20210426+data+ai+nl&mkt_tok=MTA3LUZNUy0wNzAAAAF8rA-uJucI5nYkInNB60OO8SozgyRZZ2ptfW-Dt-5HR3I0ysFHju2OYpeK_JZRtxcnmHGSefwL-1zg9Be3zse6zZVklh3zcWYSCxLRvJqd5LfAJMaF

Snap ML — Speed Up Model Training https://medium.com/ibm-data-ai/snap-ml-speed-up-model-training-2ef36fbbf101

26.memory-profiler (reports memory consumption line by line) https://pypi.org/project/memory-profiler/
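As a rough stand-in that needs no extra install, the sketch below uses the standard-library tracemalloc module to attribute allocations to source lines. Note that this is a different tool from memory-profiler and does not use its API; the function build_squares is hypothetical.

```python
import tracemalloc


def build_squares(n):
    # Allocate a list so there is something to measure.
    return [i * i for i in range(n)]


tracemalloc.start()
data = build_squares(100_000)
snapshot = tracemalloc.take_snapshot()
current, peak = tracemalloc.get_traced_memory()
tracemalloc.stop()

# Allocation statistics grouped by source line, largest first.
top_stats = snapshot.statistics("lineno")
for stat in top_stats[:3]:
    print(stat)
print(f"current={current} bytes, peak={peak} bytes")
```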

Cython A Speed-Up Tool for your Python Function https://towardsdatascience.com/cython-a-speed-up-tool-for-your-python-function-9bab64364bfd

PyPy Run Your Python Code as Fast as C https://towardsdatascience.com/run-your-python-code-as-fast-as-c-4ae49935a826

Python Tricks for Keeping Track of Your Data https://towardsdatascience.com/python-tricks-for-keeping-track-of-your-data-aef3dc817a4e

27.numexpr (increases the execution speed of NumPy expressions) https://github.com/pydata/numexpr

pypolars instead of pandas (beating-pandas-performance) https://www.youtube.com/watch?v=1-O_KnLZEso https://towardsdatascience.com/3x-times-faster-pandas-with-pypolars-7550e605805e

50X speed up your Pandas apply function https://github.com/jmcarpenter2/swifter

sklearn 100x Faster https://www.kdnuggets.com/2019/09/train-sklearn-100x-faster.html

JAX Autograd and XLA, facilitating high-performance machine learning research https://github.com/google/jax

Numba (optimise performance of numpy and high performance python compiler) http://numba.pydata.org/

Pyston project open sources its faster Python https://www.infoworld.com/article/3618169/pyston-project-open-sources-its-faster-python.html

28.pandarallel (simple and efficient tool to parallelize your pandas computation on all your CPUs) https://pypi.org/project/pandarallel/

Pandarallel, Pandarallel’s parallel_apply()

29.PDFTableExtract(by PyPDF2) https://github.com/ashima/pdf-table-extract

Camelot-https://towardsdatascience.com/extracting-tabular-data-from-pdfs-made-easy-with-camelot-80c13967cc88

30.PyImpuyte(Python package that simplifies the task of imputing missing values in big datasets) https://pypi.org/project/PyImpuyte/

31.libra(Automates the end-to-end machine learning process in just one line of code) https://pypi.org/project/libra/

32.Debug code with python -m pdb -c continue

33.cURL (This is a useful tool for obtaining data from any server via a variety of protocols including HTTP.)
https://stackabuse.com/using-curl-in-python-with-pycurl/

34.csvkit https://pypi.org/project/csvkit/

35.IPython IPython gives access to enhanced interactive python from the shell.

36.pip install faker (Create our own Dataset) https://pypi.org/project/Faker/

37.Python debugger %pdb

38.voila - From notebooks to standalone web applications and dashboards https://voila.readthedocs.io/en/stable/ https://github.com/voila-dashboards/voila

39.tslearn for timeseries data https://github.com/tslearn-team/tslearn

40.texthero - work with a text-based dataset in a Pandas DataFrame quickly and effortlessly https://github.com/jbesomi/texthero

41.kaleido (static image export for web-based visualization libraries, usable from a Jupyter Notebook, with zero dependencies) https://pypi.org/project/kaleido/

42.Vaex- Reading And Processing Huge Datasets in seconds https://github.com/vaexio/vaex

43.Uber’s Ludwig is an Open Source Framework for Low-Code Machine Learning https://eng.uber.com/introducing-ludwig/

44.Google’s TAPAS, a BERT-Based Model for Querying Tables Using Natural Language https://github.com/google-research/tapas

45.RAPIDS open GPU Data Science https://rapids.ai/

RAPIDS cuML,cudf

tick is a lightweight machine learning library https://x-datainitiative.github.io/tick/

PyBrain modular machine learning framework http://www.pybrain.org/docs/

Shogun machine learning framework; it supports several programming languages, notably Python, R, Java, Scala, Ruby and Lua https://github.com/shogun-toolbox/shogun/

46.pyforest Lazy-import of all popular Python Data Science libraries. Stop writing the same imports over and over again. https://pypi.org/project/pyforest/0.1.1/

47.Modin Get faster Pandas with Modin https://github.com/modin-project/modin

48.Text2Code for Jupyter notebook - https://github.com/deepklarity/jupyter-text2code , https://towardsdatascience.com/data-analysis-made-easy-text2code-for-jupyter-notebook-5380e89bb493

49.Openrefine Tool-For Data Preprocessing Without Code https://analyticsindiamag.com/openrefine-tutorial-a-tool-for-data-preprocessing-without-code/

50.DeepSpeed, Microsoft's deep learning optimisation library - https://github.com/microsoft/DeepSpeed

https://analyticsindiamag.com/microsoft-releases-latest-version-of-deepspeed-its-python-library-for-deep-learning-optimisation/

51.4 pandas tricks - https://towardsdatascience.com/4-pandas-tricks-that-most-people-dont-know-86a70a007993

53.autoplotter is a python package for GUI based exploratory data analysis-https://github.com/ersaurabhverma/autoplotter

54.3 NLP Interpretability Tools For Debugging Language Models-https://www.topbots.com/nlp-interpretability-tools/

55.New Algorithm For Training Sparse Neural Networks (RigL)-https://analyticsindiamag.com/rigl-google-algorithm-neural-networks/

56.Read Data from pdf and Word-PyPDF2,PDFMiner,PDFQuery,tabula-py,pdflib for Python,PDFTables,PyFPDF2

OpenCV to Extract Information From Table Images-https://analyticsindiamag.com/how-to-use-opencv-to-extract-information-from-table-images/

57.Text Annotation-https://towardsdatascience.com/tortus-e4002d95134b

58.GDMix, A Framework That Trains Efficient Personalisation Models - https://analyticsindiamag.com/linkedin-open-sources-gdmix-a-framework-that-trains-efficient-personalisation-models/

59.Learn Machine Learning Concepts Interactively-https://towardsdatascience.com/learn-machine-learning-concepts-interactively-6c3f64518da2

60.Folium, Python Library For Geographical Data Visualization-https://analyticsindiamag.com/hands-on-tutorial-on-folium-python-library-for-geographical-data-visualization/

61.GPU Technology Conference (GTC) Keynote Oct 2020-https://www.youtube.com/watch?v=Dw4oet5f0dI&list=PLZHnYvH1qtOYOfzAj7JZFwqtabM5XPku1

62.jiant nlp task-https://github.com/nyu-mll/jiant

63.Paint (draw) your machine learning model with human-learn - https://koaning.github.io/human-learn/

64.Vector AI-https://github.com/vector-ai/vectorai

65.NVIDIA NeMo(for Conversational AI)-https://github.com/NVIDIA/NeMo

66.Deep Learning Models Without Coding(DeepCognition)-https://analyticsindiamag.com/how-to-use-deepcognition-to-build-drag-and-drop-deep-learning-models-without-coding/

67.100 Machine Learning Projects - https://medium.com/@amankharwal/100-machine-learning-projects-aff22b22dd6e

68.Question generation using Natural Language Processing-https://github.com/ramsrigouthamg/Questgen.ai

69.PixelLib(image segmentation,Blur Background,Gray Background,Background Colour Change,Background Change)-https://github.com/ayoolaolafenwa/PixelLib

70.High-Resolution 3D Human Digitization-https://shunsukesaito.github.io/PIFuHD/

71.AI model that translates 100 languages without relying on English data - https://ai.facebook.com/blog/introducing-many-to-many-multilingual-machine-translation/

72.800 free textbooks - https://open.umn.edu/opentextbooks

73.TensorDash is an application that lets you remotely monitor your deep learning model’s metrics and notifies you when your model training is completed or crashed.

https://github.com/CleanPegasus/TensorDash

HyperDash https://towardsdatascience.com/how-to-monitor-and-log-your-machine-learning-experiment-remotely-with-hyperdash-aa7106b15509

74.YellowBrick -select features, tune hyperparameters, select the best models, and understand the performance metrics.

75.Freely Available Python Books-https://rajukumarmishrablog.com/freely-available-python-books/

Collection of Python Cheat Sheets- https://rajukumarmishrablog.com/collection-of-python-cheat-sheets/

76.Add External Data to Your Pandas Dataframe - https://towardsdatascience.com/add-external-data-to-your-pandas-dataframe-with-a-one-liner-f060f80daaa4

https://www.openblender.io/#/welcome

77.visualize the model architecture-https://github.com/PerceptiLabs/PerceptiLabs

78.Train Conversational AI in 3 lines of code with NeMo and Lightning-https://towardsdatascience.com/train-conversational-ai-in-3-lines-of-code-with-nemo-and-lightning-a6088988ae37

79.Machine Learning for Healthcare by mit-https://ocw.mit.edu/courses/electrical-engineering-and-computer-science/6-s897-machine-learning-for-healthcare-spring-2019/

80.Graph libraries: pydot (an interface to Graphviz), AutoGraph (easy control flow for graphs), Neo4j Graph Data Science Library, pyRDF2Vec (representations of entities in a knowledge graph), igraph, NetworkX, euler, pyvis, NEuler (no-code graph algorithms), dgl (eases deep learning on graphs), Graph4nlp, Graph-tool, Networkit

PyG (PyTorch Geometric) Graph Neural Network Library for PyTorch https://github.com/pyg-team/pytorch_geometric

7 Open Source Libraries for Deep Learning Graphs https://www.kdnuggets.com/2021/07/7-open-source-libraries-deep-learning-graphs.html

GeometricFlux.jl,PyTorch GNN, Jraph,Spektral,Graph Nets,Deep Graph Library , PyTorch Geometric

https://www.tensorflow.org/neural_structured_learning https://github.com/deepmind/graph_nets https://deepmind.com/research/open-source/graph-nets-library

https://www.kdnuggets.com/2019/09/5-graph-algorithms-data-scientists-know.html https://towardsdatascience.com/visualizing-networks-in-python-d70f4cbeb259

Pyviz https://towardsdatascience.com/interactive-network-visualization-757af376621

AutoGL: The First Ever AutoML Framework for Graph Datasets https://analyticsindiamag.com/meet-autogl-the-first-ever-automl-framework-for-graph-datasets/

https://analyticsindiamag.com/complete-guide-to-autogl-the-latest-automl-framework-for-graph-datasets/ http://mn.cs.tsinghua.edu.cn/AutoGL/

Graph Neural Networks, PySpark, Neural Cellular Automata, FB Prophet, Google Cloud and NLP codes https://github.com/RubensZimbres/Repo-2021

AmpliGraph: A Machine Learning Library For Knowledge Graphs https://analyticsindiamag.com/guide-to-ampligraph-a-machine-learning-library-for-knowledge-graphs/

open-source project for analysis of graphs or networks GrasPy / graspologic https://graspy.neurodata.io/

Pykg2vec: A Python Library for Knowledge Graph Embedding https://analyticsindiamag.com/pykg2vec/

https://www.kdnuggets.com/2019/05/60-useful-graph-visualization-libraries.html https://www.kdnuggets.com/2015/06/top-30-social-network-analysis-visualization-tools.html

84.Google Introduces Document AI (DocAI) https://www.marktechpost.com/2020/11/05/google-introduces-document-ai-docai-platform-for-automated-document-processing/

85.100 Machine Learning Projects-https://amankharwal.medium.com/100-machine-learning-projects-aff22b22dd6e

86.https://towardsdatascience.com/25-hot-new-data-tools-and-what-they-dont-do-31bf23bd8e56

87.Opacus: A high-speed library for training PyTorch models-https://ai.facebook.com/blog/introducing-opacus-a-high-speed-library-for-training-pytorch-models-with-differential-privacy

88.lazynlp https://github.com/chiphuyen/lazynlp

90.Pseudo-Labeling (deal with small datasets) https://towardsdatascience.com/pseudo-labeling-to-deal-with-small-datasets-what-why-how-fd6f903213af
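
A minimal sketch of the pseudo-labeling loop, assuming 1-D features and a toy nearest-centroid classifier written with the standard library only. The function names and the min_margin threshold are hypothetical choices for illustration and are not taken from the linked article: train on the small labeled set, predict the unlabeled points, keep only the confident predictions as pseudo-labels, then retrain on the enlarged set.

```python
from statistics import mean


def fit_centroids(xs, ys):
    """Train the toy model: one mean feature value per label."""
    by_label = {}
    for x, y in zip(xs, ys):
        by_label.setdefault(y, []).append(x)
    return {label: mean(vals) for label, vals in by_label.items()}


def predict_with_margin(centroids, x):
    """Return the nearest label and how much closer it is than the runner-up."""
    dists = sorted((abs(x - c), label) for label, c in centroids.items())
    best_dist, best_label = dists[0]
    margin = dists[1][0] - best_dist if len(dists) > 1 else float("inf")
    return best_label, margin


def pseudo_label(labeled_x, labeled_y, unlabeled_x, min_margin=2.0):
    """Add confidently predicted unlabeled points to the training set and retrain."""
    centroids = fit_centroids(labeled_x, labeled_y)
    xs, ys = list(labeled_x), list(labeled_y)
    for x in unlabeled_x:
        label, margin = predict_with_margin(centroids, x)
        if margin >= min_margin:  # skip ambiguous points
            xs.append(x)
            ys.append(label)
    return fit_centroids(xs, ys), xs, ys


labeled_x = [1.0, 2.0, 9.0, 10.0]
labeled_y = ["low", "low", "high", "high"]
unlabeled_x = [0.5, 1.5, 5.4, 9.5, 11.0]

model, train_x, train_y = pseudo_label(labeled_x, labeled_y, unlabeled_x)
print(model, len(train_x))
```

The point 5.4 sits almost halfway between the two classes, so it is left out rather than risk training on a wrong label. In practice the confidence measure would usually be a predicted probability from a real model.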

91.Project List A - Comparatively Easy: Wine Quality Analysis,Boston Housing Prediction,Spam Email Classification,Survival Prediction - Titanic Disaster,Stock Market Prediction,Class of Flower Prediction,Bigmart Sales Prediction,Air Pollution Prediction,IMDB Prediction,Optimizing Product Price,Web Traffic Time Series Forecasting,Insurance Purchase Prediction,Tweet Classification

Project List B - Comparatively Difficult: Domain-Specific Chatbot,Fake News Detection,Human Action Recognition,Video Classification,Driver Drowsiness Detection,Medical Report Gen Using CT Scans,Sign Language Detection,Image Caption Generator,Celebrity Voice Prediction,Speech Emotion Recognition,Job Recommendation System,Interest Level in Rental Properties,Google Ads Keywords Generator

https://www.analyticsvidhya.com/blog/2018/05/24-ultimate-data-science-projects-to-boost-your-knowledge-and-skills/

https://ml-showcase.paperspace.com/ https://github.com/ashishpatel26/500-AI-Machine-learning-Deep-learning-Computer-vision-NLP-Projects-with-code

https://dev.to/hb/30-machine-learning-ai-data-science-project-ideas-gf5 https://www.theinsaneapp.com/2021/01/top-30-ai-and-ml-projects-for-2021.html

https://medium.com/coders-camp/180-data-science-and-machine-learning-projects-with-python-6191bc7b9db9

https://www.analyticsvidhya.com/blog/2020/12/10-data-science-projects-for-beginners/?utm_source=linkedin&utm_medium=AV|link|high-performance-blog|blogs|44195|0.375

https://medium.com/the-innovation/130-machine-learning-projects-solved-and-explained-605d188fb392 https://medium.com/coders-camp/96-python-projects-with-source-code-4069eb58beef

https://thecleverprogrammer.com/machine-learning/ https://www.kdnuggets.com/2020/03/20-machine-learning-datasets-project-ideas.html

https://www.kdnuggets.com/2021/03/10-amazing-machine-learning-projects-2020.html?utm_content=bufferc38bd&utm_medium=social&utm_source=twitter.com&utm_campaign=buffer

https://data-flair.training/blogs/machine-learning-datasets/# https://data-flair.training/blogs/machine-learning-project-ideas/

https://data-flair.training/blogs/artificial-intelligence-ai-tutorial/ https://www.theinsaneapp.com/2020/11/data-science-projects-with-source-code.html

https://data-flair.training/blogs/cartoonify-image-opencv-python/ https://data-flair.training/blogs/python-project-calorie-calculator-django/

https://www.theinsaneapp.com/2020/11/machine-learning-projects-with-source-codes.html

https://amankharwal.medium.com/20-machine-learning-projects-on-future-prediction-with-python-93932d9a7f7f

https://medium.com/coders-camp/20-deep-learning-projects-with-python-3c56f7e6a721 https://amankharwal.medium.com/12-machine-learning-projects-on-object-detection-46b32adc3c37

https://amankharwal.medium.com/7-python-gui-projects-for-beginners-87ae2c695d78 https://github.com/Kushal997-das/Project-Guidance

https://amankharwal.medium.com/20-machine-learning-projects-for-portfolio-81e3dbd167b1 https://amankharwal.medium.com/4-chatbot-projects-with-python-5b32fd84af37

https://amankharwal.medium.com/30-python-projects-solved-and-explained-563fd7473003

https://www.aiquotient.app/projects https://www.aiquotient.app/ https://www.mltut.com/best-machine-learning-projects-for-beginners/

https://medium.com/coders-camp/20-machine-learning-projects-on-nlp-582effe73b9c

93.The Linux Command Handbook-https://www.freecodecamp.org/news/the-linux-commands-handbook/

94.130 Machine Learning Projects Solved and Explained-https://medium.com/the-innovation/130-machine-learning-projects-solved-and-explained-605d188fb392

95.DataBrew-do drag-and-drop data cleansing

96.stratascratch- https://www.stratascratch.com/

97.5 ways to celebrate TensorFlow’s 5th birthday-https://blog.google/technology/ai/5-ways-celebrate-tensorflows-5th-birthday/

98.TensorFlow.js: Machine Learning in Javascript https://blog.tensorflow.org/2018/03/introducing-tensorflowjs-machine-learning-javascript.html

99.Language Interpretability Tool open-source platform for visualization and understanding of NLP models - https://pair-code.github.io/lit/

100.Deep Learning Hardware Guide https://towardsdatascience.com/another-deep-learning-hardware-guide-73a4c35d3e86

101.johnsnowlabs- https://nlp.johnsnowlabs.com/ https://nlp.johnsnowlabs.com/docs/en/quickstart https://nlp.johnsnowlabs.com/docs/en/licensed_release_notes

104.Clarifai-https://www.clarifai.com/ https://analyticsindiamag.com/clarifai/

105.rapidly build and deploy machine learning models https://analyticsindiamag.com/top-10-datarobot-alternatives-one-must-know/

106.Hive Data full-stack AI https://thehive.ai/hive-data

107.real-time remote service to get the Keras callbacks to the telegram including the details of metrics https://github.com/ksdkamesh99/TensorGram

108.Language Interpretability Tool - https://pair-code.github.io/lit/demos/

109.Docly will handle the comments http://thedocly.io/

110.machine-learning-roadmap-2020 https://whimsical.com/machine-learning-roadmap-2020-CA7f3ykvXpnJ9Az32vYXva

112.freecodecamp - https://www.freecodecamp.org/learn

113.image_to_string (pytesseract)

Extract Tables in PDFs to pandas DataFrames - tabula-py

114.NLP Pipelines in a single line of code https://medium.com/analytics-vidhya/nlp-pipelines-in-a-single-line-of-code-500b3266ac7b

116.aitextgen #for ai text generation

117.http://introtodeeplearning.com/ http://cs231n.stanford.edu/ http://web.stanford.edu/class/cs224n/index.html#schedule https://www.youtube.com/playlist?list=PLkFD6_40KJIwhWJpGazJ9VSj9CFMkb79A https://www.youtube.com/playlist?list=PLwRJQ4m4UJjPiJP3691u-qWwPGVKzSlNP https://www.youtube.com/playlist?list=PLoROMvodv4rMC6zfYmnD7UG3LVvwaITY5

118.https://data-flair.training/blogs/data-science-tutorials-home

119.Pystiche - Create Your Artistic Image Using Pystiche https://analyticsindiamag.com/pystiche/ https://pystiche.readthedocs.io/en/latest/index.html

120.Low Light Image Enhancement using Python & Deep Learning https://github.com/soumik12345/MIRNet/ https://www.youtube.com/watch?v=b5Uz_c0JLMs

121.Koalas on Apache Spark - Pandas API https://www.youtube.com/watch?v=kOtAMiMe1JY&t=482s https://koalas.readthedocs.io/en/latest/

122.DALL·E https://openai.com/blog/dall-e/ https://analyticsindiamag.com/comprehensive-guide-to-dall-e-by-openai-creating-images-from-text/

https://github.com/lucidrains/big-sleep https://github.com/lucidrains/deep-daze https://www.youtube.com/watch?v=lVR5kN7SjQ8&feature=youtu.be

DALL·E,DALL·E Mini,DALL·E 2,DALL·E 3,GPT-3,Imagen,Re-Imagen,Parti,Midjourney,Craiyon,Make-A-Scene,NUWA-Infinity,CogView 2,VQGAN,VQGAN-CLIP,Latent-Diffusion,CogVideo,Big Sleep,Disco Diffusion,Stable Diffusion,fast-stable-diffusion,DreamStudio,CodeFormer,DreamBooth,TikTok's AI Greenscreen,textual_inversion,GauGAN2,Stable-Craiyon,Wonder,NightCafe,loab,starryai,Dream by Wombo,Pixray,Deep Dream,DreamFusion,Make-A-Video,Imagen Video,ERNIE-ViLG 2.0,eDiffi,promptoMANIA,Artbreeder,Muse,BlueWillow,StyleGAN-T,GigaGAN,DeepFloyd IF,Bing Image Creator,InstantArt,Playground AI,Picsart,Perfusion AI,XGen-Image,Ideogram AI,DeciDiffusion,lexica

https://pharmapsychotic.com/tools.html https://airtable.com/shrDxAxCCxAZVtMnt/tbl3FzgFjvvuYZMm9 https://www.marktechpost.com/2022/10/05/top-artificial-intelligence-ai-based-text-to-image-generators/

text to video,images,audio,3D: Adobe firefly,NVIDIA Picasso,Runway

text to video : CogVideo,Make-A-Video,Phenaki,Imagen Video,DreamFusion,GODIVA,NÜWA,Google UniTune (fine-tuned Imagen),Synthesia,Lumen5,Flixclip,Elai,Veed.io,Kaiber,Genmo,LeiaPix,Glia Cloud,Stable Diffusion Videos,InVideo,Designs.ai,Pictory,Wisecut,Fliki,Shap-e,dalle,pointe,AdaMPI,AudioGen

3D Models from Text : DreamFusion,CLIP-Mesh,Point-E,Magic3D,Text2Mesh,Neuralangelo

Text-to-Audio : Audiogen,diffsound,GliaCloud,Synthesia,InVideo,Synths Video,VEED.IO,Lumen5,Pictory,Designs.ai,Wisecut,Replica,Speechify,Murf,Play.ht,Lovo.ai,VALL-E,VALL-E X,MusicLM,SingSong,Moûsai 2,AudioLDM,EPIC-SOUND

Top 12 AI Music Generators :MusicLM – Google’s Text to Music Generator,Soundraw.io,Amper Music,AIVA,Humtap,Amadeus Code,Computoser,Google’s Magenta ,Chrome’s Song Maker,Generative.FM,MuseNet

Text-to-Motion : MotionCLIP,Language2Pose

Text-to-PowerPoint : ChatBCG

Mubert Text to Music https://github.com/MubertAI/Mubert-Text-to-Music ,MusicLM,MusicGen

Music generator AIVA,Amper AI,Jukebox,Soundraw,Evoke, AudioML,EnCodec

Text generators Frase Io,Peppertype,Rytr,Jasper,Copy.ai,ChatGPT

Beginner’s Guide to the CLIP Model https://www.kdnuggets.com/2021/03/beginners-guide-clip-model.html https://www.kdnuggets.com/2021/03/multilingual-clip--huggingface-pytorch-lightning.html

StyleCLIP: Text Driven Image Manipulation https://analyticsindiamag.com/guide-to-styleclip-text-driven-image-manipulation/

https://sachinruk.github.io/blog/pytorch/pytorch%20lightning/loss%20function/gpu/2021/03/07/CLIP.html

123.SpeechBrain https://speechbrain.github.io/

124.Real-Time High-Resolution Background Replacement https://analyticsindiamag.com/introducing-real-time-high-resolution-background-replacement/ https://github.com/PeterL1n/BackgroundMattingV2

125.greppo Build & deploy geospatial applications quick and easy. https://github.com/greppo-io/greppo

126.Online tools to create mind-blowing AI art https://analyticsindiamag.com/online-tools-to-create-mind-blowing-ai-art/

HAPPY LEARNING
#mlcheatsheet#_1647584821844.pdf
100 Excel Tips_1647584822050.pdf
20 Python Libraries You_1647584822375.pdf
AI, XAI, AI-Ethics,_1647584822795.pdf
AI-ML-DS-Statistics-Math_Tutorials-Books-Videos_Podcasts_Websites_1647584823042.pdf
Andrew Ng, Machine Learning Yearning_1647584823361.pdf
Basic Git Commands_1647584823697.pdf
CNN Cheatsheet_1647584823802.pdf
CONVOLUTIONAL NEURAL NETWORKS_1647584824065.pdf
Calculus Cheatsheet_1647584824148.pdf
Collection of Google Colab NBs_1647584824378.pdf
Common Machine Learning Algorithms_1647584826223.pdf
Data Engineering_1647584827111.pdf
Data Mining Process, Techniques, Tools_1647584827415.pdf
Data Science Cheatsheet_1647584828098.pdf
Data Science Resources_1647584828217.pdf
Data Science from scratch_1647584829147.pdf
Data Visualization_1647584830855.pdf
DataAnalytics_1647584831690.pdf
Deep Learning Tips and Tricks_1647584831972.pdf
Docker Cheat Sheet_1647584832121.pdf
Evaluation of Clustering_1647584832308.pdf
Excel Shortcuts_1647584832468.pdf
Flask_1647584832481.pdf
Introducing PyCaret_1647584833010.pdf
Learning Python From Zero to hero_1647584833461.pdf
Learnmachinelearning_1647584834164.pdf
Linear Algebra and Calculus_1647584834341.pdf
Linux Command Line Cheat Sheet_1647584834936.pdf
ML advancements_1647584835007.pdf
Machine Learning Tips and Tricks_1647584835531.pdf
Mathematics for Machine Learning short notes_1647584836028.pdf
PySpark_SQL_Cheat_Sheet_Python_1647584836427.pdf
Python Cheatsheet_1647584837525.pdf
Pytorch_1647584837765.pdf
Quick Reference Sheet - ML , DL , AI_1647584840787.pdf
R Cheat Sheets_1647584845329.pdf
R Programming Cheat Sheet_1647584846203.pdf
R Quick Reference_1647584846564.pdf
R_1647584847785.pdf
RNN Cheatsheet_1647584848022.pdf
Reinforcement Learning_1647584848124.pdf
Rules of Machine Learning_1647584848304.pdf
SQL2_1647584848314.pdf
Scikit_Learn_Cheat_Sheet_Python_1647584848398.pdf
Statistics Cheatsheet_1647584848542.pdf
StatisticsWithJuliaDRAFT_1647584849124.pdf
Stats_1647584850067.pdf
Time Series Forecasting Models in Python_1647584850077.pdf
Top Computer Vision Google Colab Notebooks_1647584850144.pdf
Top Interview Questions_1647584850268.pdf
Unix,Linux Command Reference_1647584850303.pdf
Visualization of higher dim data_1647584850470.pdf
Which chart or graph_1647584850787.pdf
all visuallation_1647584850983.pdf
andrew ng deeplearning_1647584851448.pdf
data collection_1647584852603.pdf
data science interview questions by steve_1647584852871.pdf
datacleaning_1647584853489.pdf
git-cheat-sheet_1647584853601.pdf
hadoop-hdfs-commands-cheatsheet_1647584853655.pdf
linear_algebra_in_4_pages_1647584853777.pdf
pandas_1647584853881.pdf
pandas_cheat_sheet_1647584854262.pdf
probability_cheatsheet_1647584854301.pdf
python_1647584854398.pdf
spacy_1647584854528.pdf
stanford university_1647584855032.pdf
stats_handout_1647584855533.pdf
streamlit cheat sheet v1.0_1647584855694.pdf
➡️150⬅️ Machine Learning Formulas_1647584856072.pdf