Tags

Significance

Intuitive summaries

Definitions

  • A system that is intelligent and constructed by humans.
  • A branch of computer science which develops and studies intelligent machines.
  • Artificial intelligence (AI), in its broadest sense, is intelligence exhibited by machines, particularly computer systems. It is a field of research in computer science that develops and studies methods and software which enable machines to perceive their environment and use learning and intelligence to take actions that maximize their chances of achieving defined [[goal|goals]].
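
The "perceive the environment and take actions that maximize the chances of achieving goals" framing maps onto the standard agent loop. A minimal sketch, assuming a toy 1-D world; `Environment` and `expected_goal_progress` are hypothetical stand-ins, not any real API:

```python
# Minimal agent loop illustrating the "perceive the environment, act to achieve goals"
# framing in the definition above. Environment and expected_goal_progress are
# hypothetical toy stand-ins, not any real API.

class Environment:
    """Toy 1-D world: the agent tries to reach position `goal`."""

    def __init__(self, goal: int = 5):
        self.state = 0
        self.goal = goal

    def perceive(self) -> int:
        return self.state                 # observation = current position

    def act(self, action: int) -> None:
        self.state += action              # actions are steps of -1, 0, or +1

    def done(self) -> bool:
        return self.state == self.goal


def expected_goal_progress(state: int, action: int, goal: int) -> float:
    """Toy heuristic: negative distance to the goal after taking `action`."""
    return -abs((state + action) - goal)


def run_agent(env: Environment, actions=(-1, 0, 1), max_steps: int = 20) -> int:
    for _ in range(max_steps):
        if env.done():
            break
        obs = env.perceive()                                              # perceive
        best = max(actions, key=lambda a: expected_goal_progress(obs, a, env.goal))
        env.act(best)                                                     # act toward the goal
    return env.state


print(run_agent(Environment(goal=5)))   # -> 5
```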

Technical summaries

Main resources

Landscapes

Crossovers

Resources

Stanford machine learning https://www.youtube.com/playlist?list=PLoROMvodv4rNyWOpJg_Yh4NSqI4Z4vOYy
Stanford machine learning https://www.youtube.com/playlist?list=PLoROMvodv4rMiGQp3WXShtMGgzqpfVfbU
Stanford transformers https://www.youtube.com/playlist?list=PLoROMvodv4rNiJRchCzutFw5ItR_Z27CM
Stanford generative models including diffusion https://www.youtube.com/playlist?list=PLoROMvodv4rPOWA-omMM6STXaWW4FvJT8
Stanford deep learning https://www.youtube.com/playlist?list=PLoROMvodv4rOABXSygHTsbvUz4G_YQhOb
Karpathy neural networks zero to hero https://www.youtube.com/playlist?list=PLAqhIrjkxbuWI23v9cThsA9GvCAUhRvKZ
Stanford natural language processing with deep learning https://www.youtube.com/playlist?list=PLoROMvodv4rMFqRtEuo6SGjY4XbRIVRd4
MIT deep learning https://www.youtube.com/playlist?list=PLTZ1bhP8GBuTCqeY19TxhHyrwFiot42_U
Stanford artificial intelligence https://www.youtube.com/playlist?list=PLoROMvodv4rO1NB9TD4iUZ3qghGEGtqNX
Stanford machine learning with graphs https://www.youtube.com/playlist?list=PLoROMvodv4rPLKxIpqhjhPgdQy7imNkDn
Stanford natural language understanding https://www.youtube.com/playlist?list=PLoROMvodv4rOwvldxftJTmoR3kRcWkJBp
Stanford reinforcement learning https://www.youtube.com/playlist?list=PLoROMvodv4rOSOPzutgyCTapiGlY2Nd8u
Stanford meta-learning https://www.youtube.com/playlist?list=PLoROMvodv4rNjRoawgt72BBNwL2V7doGI
Stanford artificial intelligence https://www.youtube.com/playlist?list=PLoROMvodv4rPgrvmYbBrxZCK_GwXvDVL3
Stanford machine learning theory https://www.youtube.com/playlist?list=PLoROMvodv4rP8nAmISxFINlGKSK4rbLKh
Stanford computer vision https://www.youtube.com/playlist?list=PLkt2uSq6rBVctENoVBg1TpCC7OQi31AlC
https://www.youtube.com/playlist?list=PLSVEhWrZWDHQTBmWZufjxpw3s8sveJtnJ
Stanford statistics https://www.youtube.com/playlist?list=PLoROMvodv4rOpr_A7B9SriE_iZmkanvUg
Stanford methods in AI https://www.youtube.com/playlist?list=PLoROMvodv4rO1NB9TD4iUZ3qghGEGtqNX
https://www.youtube.com/playlist?list=PLrxfgDEc2NxZJcWcrxH3jyjUUrJlnoyzX
Stanford MIT robotics https://www.youtube.com/playlist?list=PLkx8KyIQkMfUmB3j-DyP58ThDXM7enA8x https://www.youtube.com/playlist?list=PLkx8KyIQkMfUSDs2hvTWzaq-cxGl8Ha69 https://www.youtube.com/playlist?list=PL65CC0384A1798ADF https://www.youtube.com/playlist?list=PLoROMvodv4rMeercb-kvGLUrOq4HR6BZD https://www.youtube.com/playlist?list=PLN1iOWWHLJz3ndzRIvpbby75G2_2pYYrL
MIT machine learning https://www.youtube.com/playlist?list=PLxC_ffO4q_rW0bqQB80_vcQB09HOA3ClV https://www.youtube.com/playlist?list=PLnvKubj2-I2LhIibS8TOGC42xsD3-liux
MIT efficient machine learning https://www.youtube.com/playlist?list=PL80kAHvQbh-pT4lCkDT53zT8DKmhE0idB
MIT linear algebra in machine learning https://www.youtube.com/playlist?list=PLUl4u3cNGP63oMNUHXqIUcrkS2PivhN3k
Principles of Deep Learning Theory https://arxiv.org/abs/2106.10165 https://www.youtube.com/watch?v=YzR2gZrsdJc https://www.youtube.com/watch?v=pad023JIXVA
Active Inference book https://mitpress.mit.edu/9780262045353/active-inference/
Geometric deep learning https://geometricdeeplearning.com/
Mechanistic interpretability https://www.neelnanda.io/mechanistic-interpretability
Topological data analysis https://www.youtube.com/playlist?list=PLzERW_Obpmv_UW7RgbZW4Ebhw87BcoXc7

Deep dives

State of the art

  • AI Index Report 2024 – Artificial Intelligence Index Top 10 Takeaways:

  • AI beats humans on some tasks, but not on all. AI has surpassed human performance on several benchmarks, including some in image classification, visual reasoning, and English understanding. Yet it trails behind on more complex tasks like competition-level mathematics, visual commonsense reasoning and planning.

  • Industry continues to dominate frontier AI research. In 2023, industry produced 51 notable machine learning models, while academia contributed only 15. There were also 21 notable models resulting from industry-academia collaborations in 2023, a new high.

  • Frontier models get way more expensive. According to AI Index estimates, the training costs of state-of-the-art AI models have reached unprecedented levels. For example, OpenAI’s GPT-4 used an estimated $78 million worth of compute to train, while Google’s Gemini Ultra cost $191 million for compute.

  • The United States leads China, the EU, and the U.K. as the leading source of top AI models. In 2023, 61 notable AI models originated from U.S.-based institutions, far outpacing the European Union’s 21 and China’s 15.

  • Robust and standardized evaluations for LLM responsibility are seriously lacking. New research from the AI Index reveals a significant lack of standardization in responsible AI reporting. Leading developers, including OpenAI, Google, and Anthropic, primarily test their models against different responsible AI benchmarks. This practice complicates efforts to systematically compare the risks and limitations of top AI models.

  • Generative AI investment skyrockets. Despite a decline in overall AI private investment last year, funding for generative AI surged, nearly octupling from 2022 to reach $25.2 billion. Major players in the generative AI space, including OpenAI, Anthropic, Hugging Face, and Inflection, reported substantial fundraising rounds.

  • The data is in: AI makes workers more productive and leads to higher quality work. In 2023, several studies assessed AI’s impact on labor, suggesting that AI enables workers to complete tasks more quickly and to improve the quality of their output. These studies also demonstrated AI’s potential to bridge the skill gap between low- and high-skilled workers. Still, other studies caution that using AI without proper oversight can lead to diminished performance.

  • Scientific progress accelerates even further, thanks to AI. In 2022, AI began to advance scientific discovery. 2023, however, saw the launch of even more significant science-related AI applications— from AlphaDev, which makes algorithmic sorting more efficient, to GNoME, which facilitates the process of materials discovery.

  • The number of AI regulations in the United States sharply increases. The number of AI-related regulations in the U.S. has risen significantly in the past year and over the last five years. In 2023, there were 25 AI-related regulations, up from just one in 2016. Last year alone, the total number of AI-related regulations grew by 56.3%.

  • People across the globe are more cognizant of AI’s potential impact—and more nervous. A survey from Ipsos shows that, over the last year, the proportion of those who think AI will dramatically affect their lives in the next three to five years has increased from 60% to 66%. Moreover, 52% express nervousness toward AI products and services, marking a 13 percentage point rise from 2022. In America, Pew data suggests that 52% of Americans report feeling more concerned than excited about AI, rising from 37% in 2022.

Brain storming

And humans may be, similarly reductively, "just a completer of predictions of input signals that are compared to actual signals" (using a version of Bayesian inference), i.e. predictive coding, or "just bioelectricity and biochemistry", or "just particles".
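
A minimal sketch of that predictive-coding idea, assuming a toy generative model where a single latent cause mu predicts the observation through a fixed linear mapping; inference is just gradient descent on the precision-weighted prediction error, which converges to the Bayesian posterior mean:

```python
# Toy predictive coding sketch: infer a latent cause by descending on prediction error.
# Didactic only; generative model x ~ N(g * mu, sigma_x), prior mu ~ N(prior_mu, sigma_mu).

def infer_latent(x_obs, g=2.0, prior_mu=0.0, sigma_x=1.0, sigma_mu=1.0,
                 lr=0.05, steps=200):
    """Gradient ascent on the log posterior of mu given one observation x_obs."""
    mu = prior_mu
    for _ in range(steps):
        pred_error = x_obs - g * mu      # sensory prediction error (top-down prediction vs. input)
        prior_error = mu - prior_mu      # deviation of the belief from the prior
        mu += lr * (g * pred_error / sigma_x - prior_error / sigma_mu)
    return mu

print(infer_latent(x_obs=4.0))   # converges near the exact Bayesian posterior mean, 1.6
```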

Right now the biggest limitation is probably getting more complex systematic coherent reasoning, planning, generalization, agency (autonomy), memory, factual groundedness, online learning, human-like ethical reasoning, and controllability into AI systems; they are still fairly weak at these on harder tasks as their capabilities scale. But progress is being made, whether through composing LLMs into multi-agent systems, scaling, higher-quality data and training, poking at how they work internally in order to steer them, better mathematical models of how learning works, tweaked or completely reworked architectures, and so on, plus there is the ongoing push to get robotics working, and all the top AI labs are working on or investing in these things to varying degrees. Here are some works:

Survey of LLMs: [2312.03863] Efficient Large Language Models: A Survey, [2311.10215] Predictive Minds: LLMs As Atypical Active Inference Agents, A Systematic Survey of Prompt Engineering in Large Language Models: Techniques and Applications

Reasoning: Human-like systematic generalization through a meta-learning neural network | Nature, [2305.20050] Let's Verify Step by Step, [2302.00923] Multimodal Chain-of-Thought Reasoning in Language Models, [2310.09158] Learning To Teach Large Language Models Logical Reasoning, [2303.09014] ART: Automatic multi-step reasoning and tool-use for large language models, AlphaGeometry: An Olympiad-level AI system for geometry - Google DeepMind

Robotics: Mobile ALOHA - A Smart Home Robot - Compilation of Autonomous Skills - YouTube, Eureka! Extreme Robot Dexterity with LLMs | NVIDIA Research Paper - YouTube, Shaping the future of advanced robotics - Google DeepMind, Optimus - Gen 2 - YouTube, Atlas Struts - YouTube, Figure Status Update - AI Trained Coffee Demo - YouTube, Curiosity-Driven Learning of Joint Locomotion and Manipulation Tasks - YouTube

Multiagent systems: [2402.01680] Large Language Model based Multi-Agents: A Survey of Progress and Challenges

Pozměněný/alternativní architektury: Mamba (deep learning architecture) - Wikipedia, [2305.13048] RWKV: Reinventing RNNs for the Transformer Era, V-JEPA: The next step toward advanced machine intelligence, Active Inference

Agency: [2305.16291] Voyager: An Open-Ended Embodied Agent with Large Language Models, [2309.07864] The Rise and Potential of Large Language Model Based Agents: A Survey, Agents | Langchain, GitHub - THUDM/AgentBench: A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24), [2401.12917] Active Inference as a Model of Agency, CAN AI THINK ON ITS OWN? - YouTube, Artificial Curiosity Since 1990

Factual groundedness: [2312.10997] Retrieval-Augmented Generation for Large Language Models: A Survey, Perplexity, ChatGPT - Consensus

Memory:
larger context windows (e.g. Gemini's 10-million-token context window), or vector databases
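
A minimal sketch of the vector-database idea above (the same mechanism underlies the retrieval-augmented generation linked under factual groundedness): store text snippets as embedding vectors and retrieve the nearest ones by cosine similarity to augment a prompt. The `embed` function here is a hypothetical placeholder (bag-of-words hashing), not a real embedding model:

```python
# Toy vector memory: cosine-similarity retrieval over stored snippets.
# `embed` is a hypothetical placeholder; a real system would use a learned embedding model.

import numpy as np

def embed(text: str, dim: int = 64) -> np.ndarray:
    vec = np.zeros(dim)
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0          # crude bag-of-words hashing
    norm = np.linalg.norm(vec)
    return vec / norm if norm > 0 else vec

class VectorMemory:
    def __init__(self):
        self.texts, self.vecs = [], []

    def add(self, text: str) -> None:
        self.texts.append(text)
        self.vecs.append(embed(text))

    def retrieve(self, query: str, k: int = 2):
        q = embed(query)
        sims = [float(v @ q) for v in self.vecs]           # cosine similarity (unit vectors)
        top = sorted(range(len(sims)), key=lambda i: -sims[i])[:k]
        return [self.texts[i] for i in top]

memory = VectorMemory()
memory.add("GPT-4 training compute was estimated at $78 million.")
memory.add("Gemini Ultra training compute was estimated at $191 million.")
memory.add("AlphaDev found faster sorting algorithms.")

context = memory.retrieve("How much did Gemini cost to train?")
prompt = "Context:\n" + "\n".join(context) + "\n\nQuestion: How much did Gemini cost to train?"
print(prompt)
```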

Online learning: <https://en.wikipedia.org/wiki/Online_machine_learning>
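
A minimal sketch of online learning as linked above: a linear model updated one example at a time with stochastic gradient descent, so it keeps adapting as data streams in (assumptions: squared loss, a toy synthetic stream):

```python
# Online learning sketch: a linear regressor updated per-example as data streams in.

import numpy as np

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0])
w = np.zeros(2)
lr = 0.05

for t in range(2000):                       # each iteration = one new streamed example
    x = rng.normal(size=2)
    y = true_w @ x + 0.1 * rng.normal()     # noisy target from the (unknown) true weights
    y_hat = w @ x
    grad = (y_hat - y) * x                  # gradient of 0.5 * (y_hat - y)^2 w.r.t. w
    w -= lr * grad                          # single SGD step, then the example is discarded

print(w)   # close to [2.0, -1.0]
```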

Meta learning: https://en.wikipedia.org/wiki/Meta-learning_(computer_science)
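
A minimal sketch of one meta-learning approach, in the style of the Reptile algorithm (not the only approach): adapt a copy of the parameters to each sampled task with a few gradient steps, then nudge the shared initialization toward the adapted parameters so new tasks can be learned quickly from it. The task family (1-D linear regression with varying slope) and hyperparameters are illustrative assumptions:

```python
# Reptile-style meta-learning sketch on toy tasks (linear regression with varying slope).
# Didactic only; hyperparameters and the task family are illustrative assumptions.

import numpy as np

rng = np.random.default_rng(0)

def sample_task():
    slope = rng.uniform(0.5, 2.5)
    x = rng.normal(size=(20, 1))
    y = slope * x
    return x, y

def inner_sgd(w, x, y, lr=0.02, steps=10):
    for _ in range(steps):
        grad = 2 * x.T @ (x @ w - y) / len(x)   # gradient of MSE for the linear model
        w = w - lr * grad
    return w

w_meta = np.zeros((1, 1))                       # shared initialization being meta-learned
meta_lr = 0.1

for _ in range(500):
    x, y = sample_task()
    w_task = inner_sgd(w_meta.copy(), x, y)     # adapt to this task
    w_meta += meta_lr * (w_task - w_meta)       # Reptile outer update: move toward adapted weights

print(w_meta)   # ends up near the mean task slope (~1.5), a good starting point for new tasks
```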

Planning: [2402.01817] LLMs Can't Plan, But Can Help Planning in LLM-Modulo Frameworks, [2401.11708v1] Mastering Text-to-Image Diffusion: Recaptioning, Planning, and Generating with Multimodal LLMs, [2305.16151] Understanding the Capabilities of Large Language Models for Automated Planning

Generalizing: [2402.10891] Instruction Diversity Drives Generalization To Unseen Tasks, Automated discovery of algorithms from data | Nature Computational Science, [2402.09371] Transformers Can Achieve Length Generalization But Not Robustly, [2310.16028] What Algorithms can Transformers Learn? A Study in Length Generalization, [2307.04721] Large Language Models as General Pattern Machines, A Tutorial on Domain Generalization | Proceedings of the Sixteenth ACM International Conference on Web Search and Data Mining, [2311.06545] Understanding Generalization via Set Theory, [2310.08661] Counting and Algorithmic Generalization with Transformers, Neural Networks on the Brink of Universal Prediction with DeepMind’s Cutting-Edge Approach | Synced, [2401.14953] Learning Universal Predictors, Spectral bias and task-model alignment explain generalization in kernel regression and infinitely wide neural networks | Nature Communications

It is quite possible (and a large percentage of researchers think so) that the research trying to control these wild inscrutable matrices is not developing fast enough compared to capabilities research (expanding the range of things these systems can do).

We also have no idea how to switch off behaviors with current methods (Sleeper Agents: Training Deceptive LLMs that Persist Through Safety Training, Anthropic), which could be seen in recent days when GPT-4 started outputting total chaos after an update (OpenAI's ChatGPT Went Completely Off the Rails for Hours), Gemini was more woke than intended, and every other moment I see a new jailbreak that bypasses the guardrails: [2307.15043] Universal and Transferable Adversarial Attacks on Aligned Language Models.

As for definitions of AGI, this one from DeepMind is good: Levels of AGI: Operationalizing Progress on the Path to AGI. I also like OpenAI's definition, vague but reasonably good: highly autonomous systems that outperform humans at most economically valuable work. This is a nice thread of various definitions and their pros and cons: 9 definitions of Artificial General Intelligence (AGI) and why they are flawed. There is also Universal Intelligence: A Definition of Machine Intelligence, and Karl Friston has good definitions (KARL FRISTON - INTELLIGENCE 3.0).

If you want sources of AI predictions other than Metaculus, LessWrong/EA, the Singularity community, and various influential people at top AI labs, who tend to have very short timelines within the next 10 years (Singularity Predictions 2024 by some people big in the field; Metaculus: When will the first weakly general AI system be devised, tested, and publicly announced?),

then there are also these predictions, whose intervals have been shrinking roughly by half each year in these questionnaires:
AI experts make predictions for 2040. I was a little surprised. | Science News, Thousands of AI Authors on the Future of AI:
"In the largest survey of its kind, 2,778 researchers who had published in top-tier artificial intelligence (AI) venues gave predictions on the pace of AI progress and the nature and impacts of advanced AI systems The aggregate forecasts give at least a 50% chance of AI systems achieving several milestones by 2028, including autonomously constructing a payment processing site from scratch, creating a song indistinguishable from a new song by a popular musician, and autonomously downloading and fine-tuning a large language model. If science continues undisrupted, the chance of unaided machines outperforming humans in every possible task was estimated at 10% by 2027, and 50% by 2047. The latter estimate is 13 years earlier than that reached in a similar survey we conducted only one year earlier [Grace et al., 2022]. However, the chance of all human occupations becoming fully automatable was forecast to reach 10% by 2037, and 50% as late as 2116 (compared to 2164 in the 2022 survey).
Most respondents expressed substantial uncertainty about the long-term value of AI progress: While 68.3% thought good outcomes from superhuman AI are more likely than bad, of these net optimists 48% gave at least a 5% chance of extremely bad outcomes such as human extinction, and 59% of net pessimists gave 5% or more to extremely good outcomes. Between 38% and 51% of respondents gave at least a 10% chance to advanced AI leading to outcomes as bad as human extinction. More than half suggested that "substantial" or "extreme" concern is warranted about six different AI-related scenarios, including misinformation, authoritarian control, and inequality. There was disagreement about whether faster or slower AI progress would be better for the future of humanity. However, there was broad agreement that research aimed at minimizing potential risks from AI systems ought to be prioritized more."

Additional resources

AI

  • Artificial Intelligence (AI) is a rapidly evolving field with numerous branches and sub-disciplines. Here's a comprehensive list of various branches of AI:

1. Machine Learning

  • Supervised Learning
  • Unsupervised Learning
  • Reinforcement Learning
  • Deep Learning
  • Neural Networks
  • Decision Trees
  • Support Vector Machines
  • Ensemble Methods
  • Clustering
  • Feature Engineering
  • Dimensionality Reduction
  • Model Selection and Training
  • Transfer Learning
  • Federated Learning

2. Natural Language Processing (NLP)

  • Speech Recognition
  • Text-to-Speech
  • Sentiment Analysis
  • Machine Translation
  • Word Embeddings
  • Named Entity Recognition
  • Part-of-Speech Tagging
  • Language Modeling
  • Text Summarization
  • Dialog Systems and Chatbots
  • Question Answering Systems
  • Natural Language Understanding
  • Natural Language Generation

3. Computer Vision

  • Image Recognition and Classification
  • Object Detection
  • Face Recognition
  • Optical Character Recognition (OCR)
  • Image Segmentation
  • Pattern Recognition
  • Motion Analysis and Tracking
  • Scene Reconstruction
  • Image Enhancement
  • 3D Vision
  • Augmented Reality

4. Robotics

  • Robotic Process Automation (RPA)
  • Humanoid Robots
  • Autonomous Vehicles
  • Drone Robotics
  • Industrial Robotics
  • Swarm Robotics
  • Soft Robotics
  • Rehabilitation Robotics
  • Robotic Surgery
  • Human-Robot Interaction

5. Knowledge Representation and Reasoning

  • Expert Systems
  • Ontologies
  • Semantic Networks
  • Fuzzy Logic Systems
  • Rule-Based Systems
  • Commonsense Reasoning
  • Case-Based Reasoning
  • Qualitative Reasoning
  • Deductive Reasoning

6. Planning and Scheduling

  • Automatic Planning
  • Decision Support Systems
  • Multi-agent Systems
  • Game Theory
  • Constraint Satisfaction
  • Resource Allocation
  • Workflow Management

7. Search and Optimization

  • Genetic Algorithms
  • Evolutionary Computing
  • Swarm Intelligence
  • Simulated Annealing
  • Hill Climbing
  • Pathfinding Algorithms
  • Particle Swarm Optimization

8. Artificial Neural Networks

  • Convolutional Neural Networks (CNN)
  • Recurrent Neural Networks (RNN)
  • Long Short-Term Memory Networks (LSTM)
  • Generative Adversarial Networks (GAN)
  • Deep Belief Networks
  • Autoencoders
  • Radial Basis Function Networks
  • Transformers
  • Mamba (Linear-Time Sequence Modeling with Selective State Spaces)

9. Data Mining and Big Data

  • Predictive Analytics
  • Data Warehousing
  • Big Data Analytics
  • Data Visualization
  • Association Rule Learning
  • Anomaly Detection

10. Affective Computing

  • Emotion Recognition
  • Affective Interfaces
  • Emotional AI
  • Human Affective Response Analysis

11. AI Ethics and Safety

  • Explainable AI
  • Fairness and Bias in AI
  • AI Governance
  • Privacy-Preserving AI
  • AI Safety and Robustness
  • Trustworthy AI

12. Cognitive Computing

  • Cognitive Modeling
  • Human-Centered AI
  • Neuromorphic Computing
  • Cognitive Robotics
  • Hybrid Intelligent Systems

13. AI in Healthcare

  • Medical Image Analysis
  • Predictive Diagnostics
  • Drug Discovery
  • Personalized Medicine
  • Patient Data Analysis

14. AI in Business

  • Customer Relationship Management
  • Business Intelligence
  • Market Analysis
  • Supply Chain Optimization
  • AI in Finance and Trading

15. AI in Education

  • Adaptive Learning Systems
  • Educational Data Mining
  • AI Tutors
  • Learning Analytics
  • Curriculum Design

16. Quantum AI

  • Quantum Machine Learning
  • Quantum Computing for AI
  • Quantum Optimization

I want to learn and have a map of as much of the mathematical theory and practice as possible of the methods used everywhere, for example in:
- statistical methods (frequentist, bayesian statistics,...)
- machine learning (supervised learning (classification, all sorts of regression), unsupervised learning (clustering, dimensionality reduction,...), semisupervised learning, reinforcement learning, ensemble methods,...)
- deep learning (all variations and combinations of classic neural nets, convolutional NNs, recurrent NNs, LSTMs, GANs, self-organizing maps, deep belief networks, deep RL, graph NNs, Neural Turing Machines, all variations of transformers, RWKV, xLSTM, diffusion,...)
- symbolic methods, neurosymbolics, state-space models, graph analysis, other stuff in natural language processing, computer vision, signal processing, anomaly detection, recommender systems, different optimization algorithms and metaheuristics, metalearning,...
etc. etc. etc.
All of it includes an essentially infinite number of infinite rabbit holes, but it's worth it.

Map of algorithms for extracting patterns from data

  1. Statistical Methods
  2. Descriptive Statistics
  3. Central Tendency (Mean, Median, Mode, Geometric Mean, Harmonic Mean)
  4. Dispersion (Range, Variance, Standard Deviation, Coefficient of Variation, Quartiles, Interquartile Range)
  5. Skewness and Kurtosis
  6. Inferential Statistics
  7. Hypothesis Testing (Z-test, t-test, F-test, Chi-Square Test, ANOVA, MANOVA, ANCOVA)
  8. Confidence Intervals
  9. Non-parametric Tests (Mann-Whitney U, Wilcoxon Signed-Rank, Kruskal-Wallis, Friedman)
  10. Regression Analysis
  11. Linear Regression (Simple, Multiple)
  12. Logistic Regression (Binary, Multinomial, Ordinal)
  13. Polynomial Regression
  14. Stepwise Regression
  15. Ridge Regression
  16. Lasso Regression
  17. Elastic Net Regression
  18. Bayesian Statistics
  19. Bayesian Inference
  20. Naive Bayes Classifier
  21. Bayesian Networks
  22. Markov Chain Monte Carlo (MCMC) Methods
  23. Survival Analysis
  24. Kaplan-Meier Estimator
  25. Cox Proportional Hazards Model
  26. Spatial Statistics
  27. Kriging
  28. Spatial Autocorrelation (Moran's I, Geary's C)

  29. Machine Learning

  30. Supervised Learning
  31. Classification
  32. Decision Trees & Random Forests
  33. Naive Bayes (Gaussian, Multinomial, Bernoulli)
  34. Support Vector Machines (SVM) (Linear, RBF, Polynomial)
  35. k-Nearest Neighbors (k-NN)
  36. Logistic Regression
  37. Neural Networks (Feedforward, Convolutional, Recurrent)
  38. Gradient Boosting Machines (GBM)
  39. AdaBoost
  40. XGBoost
  41. LightGBM
  42. CatBoost
  43. Regression
  44. Linear Regression
  45. Polynomial Regression
  46. Support Vector Regression (SVR)
  47. Decision Trees & Random Forests
  48. Neural Networks (Feedforward, Convolutional, Recurrent)
  49. Gradient Boosting Machines (GBM)
  50. AdaBoost
  51. XGBoost
  52. LightGBM
  53. CatBoost
  54. Unsupervised Learning
  55. Clustering
  56. k-Means
  57. Mini-Batch k-Means
  58. Hierarchical Clustering (Agglomerative, Divisive)
  59. DBSCAN
  60. OPTICS
  61. Mean Shift
  62. Gaussian Mixture Models
  63. Fuzzy C-Means
  64. Dimensionality Reduction
  65. Principal Component Analysis (PCA)
  66. Kernel PCA
  67. Incremental PCA
  68. t-SNE
  69. UMAP
  70. Isomap
  71. Locally Linear Embedding (LLE)
  72. Independent Component Analysis (ICA)
  73. Non-Negative Matrix Factorization (NMF)
  74. Latent Dirichlet Allocation (LDA)
  75. Autoencoders (Vanilla, Variational, Denoising)
  76. Association Rule Mining
  77. Apriori
  78. FP-Growth
  79. ECLAT
  80. Semi-Supervised Learning
  81. Self-Training
  82. Co-Training
  83. Graph-Based Methods
  84. Transductive SVM
  85. Generative Models
  86. Reinforcement Learning
  87. Q-Learning
  88. SARSA
  89. Deep Q Networks (DQN)
  90. Policy Gradients (REINFORCE, Actor-Critic)
  91. Proximal Policy Optimization (PPO)
  92. Monte Carlo Methods
  93. Temporal Difference Learning
  94. AlphaZero
  95. Ensemble Methods
  96. Bagging
  97. Boosting (AdaBoost, Gradient Boosting, XGBoost, LightGBM, CatBoost)
  98. Stacking
  99. Voting (Majority, Weighted, Soft)
  100. Random Subspace Method
  101. Rotation Forests

  102. Deep Learning

  103. Feedforward Neural Networks
  104. Convolutional Neural Networks (CNN)
  105. LeNet
  106. AlexNet
  107. VGGNet
  108. ResNet
  109. Inception
  110. DenseNet
  111. EfficientNet
  112. Recurrent Neural Networks (RNN)
  113. Long Short-Term Memory (LSTM)
  114. Gated Recurrent Units (GRU)
  115. Bidirectional RNNs
  116. Transformers
  117. Attention Mechanism
  118. Self-Attention
  119. Multi-Head Attention
  120. BERT
  121. GPT
  122. Transformer-XL
  123. XLNet
  124. Autoencoders
  125. Vanilla Autoencoders
  126. Variational Autoencoders (VAE)
  127. Denoising Autoencoders
  128. Sparse Autoencoders
  129. Generative Adversarial Networks (GANs)
  130. Vanilla GANs
  131. Deep Convolutional GANs (DCGANs)
  132. Conditional GANs
  133. Wasserstein GANs (WGANs)
  134. Cycle GANs
  135. StyleGANs
  136. Self-Organizing Maps (SOMs)
  137. Deep Belief Networks (DBNs)
  138. Deep Reinforcement Learning
  139. Deep Q Networks (DQN)
  140. Double DQN
  141. Dueling DQN
  142. Deep Deterministic Policy Gradient (DDPG)
  143. Asynchronous Advantage Actor-Critic (A3C)

  144. Time Series Analysis

  145. Exploratory Data Analysis
  146. Seasonality
  147. Trend
  148. Cyclicality
  149. Autocorrelation
  150. Partial Autocorrelation
  151. Smoothing Techniques
  152. Moving Averages (Simple, Weighted, Exponential)
  153. Holt-Winters (Additive, Multiplicative)
  154. Kalman Filter
  155. Decomposition Methods
  156. Classical Decomposition (Additive, Multiplicative)
  157. STL Decomposition
  158. Regression-based Methods
  159. Linear Regression
  160. Autoregressive Models (AR)
  161. Moving Average Models (MA)
  162. Autoregressive Moving Average Models (ARMA)
  163. Autoregressive Integrated Moving Average Models (ARIMA)
  164. Seasonal ARIMA (SARIMA)
  165. Vector Autoregression (VAR)
  166. State Space Models
  167. Exponential Smoothing State Space Models (ETS)
  168. Structural Time Series Models
  169. Dynamic Linear Models (DLMs)
  170. Machine Learning Methods
  171. Prophet
  172. Recurrent Neural Networks (RNNs)
  173. Long Short-Term Memory (LSTM)
  174. Gated Recurrent Units (GRUs)
  175. Temporal Convolutional Networks (TCNs)
  176. XGBoost
  177. Ensemble Methods
  178. Bagging
  179. Boosting
  180. Stacking
  181. Anomaly Detection
  182. Statistical Process Control
  183. Isolation Forests
  184. Robust PCA
  185. Causality Analysis
  186. Granger Causality
  187. Vector Autoregression (VAR)
  188. Convergent Cross Mapping (CCM)

  189. Anomaly Detection

  190. Statistical Methods
  191. Z-Score
  192. Interquartile Range (IQR)
  193. Mahalanobis Distance
  194. Kernel Density Estimation (KDE)
  195. Clustering-Based Methods
  196. k-Means
  197. DBSCAN
  198. Density-Based Methods
  199. Local Outlier Factor (LOF)
  200. Connectivity-Based Outlier Factor (COF)
  201. Subspace Outlier Detection (SOD)
  202. Distance-Based Methods
  203. k-Nearest Neighbors (k-NN)
  204. Ensemble Methods
  205. Isolation Forest
  206. Feature Bagging
  207. Subsampling
  208. One-Class Classification
  209. One-Class SVM
  210. Support Vector Data Description (SVDD)
  211. Autoencoder-based Methods
  212. Probabilistic Methods
  213. Gaussian Mixture Models (GMMs)
  214. Hidden Markov Models (HMMs)
  215. Bayesian Networks

  216. Natural Language Processing (NLP)

  217. Text Preprocessing
  218. Tokenization
  219. Stop Word Removal
  220. Stemming & Lemmatization
  221. Part-of-Speech (POS) Tagging
  222. Named Entity Recognition (NER)
  223. Parsing
  224. Text Representation
  225. Bag-of-Words (BoW)
  226. TF-IDF
  227. Word Embeddings (Word2Vec, GloVe, FastText)
  228. Sentence Embeddings (Doc2Vec, Sent2Vec)
  229. Contextual Embeddings (ELMo, BERT, GPT)
  230. Text Classification
  231. Naive Bayes
  232. Support Vector Machines (SVM)
  233. Logistic Regression
  234. Decision Trees & Random Forests
  235. Neural Networks (CNNs, RNNs, Transformers)
  236. Sequence Labeling
  237. Hidden Markov Models (HMMs)
  238. Conditional Random Fields (CRFs)
  239. Recurrent Neural Networks (RNNs)
  240. Transformers
  241. Topic Modeling
  242. Latent Dirichlet Allocation (LDA)
  243. Non-Negative Matrix Factorization (NMF)
  244. Latent Semantic Analysis (LSA)
  245. Hierarchical Dirichlet Process (HDP)
  246. Text Summarization
  247. Extractive Methods (TextRank, LexRank)
  248. Abstractive Methods (Seq2Seq Models, Transformers)
  249. Machine Translation
  250. Statistical Machine Translation (SMT)
  251. Neural Machine Translation (NMT)
  252. Seq2Seq Models
  253. Attention Mechanisms
  254. Transformers
  255. Sentiment Analysis
  256. Lexicon-based Methods
  257. Machine Learning Methods (Naive Bayes, SVM, Logistic Regression)
  258. Deep Learning Methods (CNNs, RNNs, Transformers)
  259. Language Modeling
  260. N-gram Models
  261. Neural Language Models (RNNs, LSTMs, GRUs)
  262. Transformers (GPT, BERT)
  263. Text Generation
  264. Rule-based Methods
  265. Statistical Language Models
  266. Neural Language Models (RNNs, LSTMs, GRUs)
  267. Transformers (GPT, BERT)
  268. Information Retrieval
  269. Boolean Models
  270. Vector Space Models (TF-IDF)
  271. Probabilistic Models (BM25)
  272. Learning to Rank (LTR)
  273. Named Entity Recognition (NER)
  274. Rule-based Methods
  275. Machine Learning Methods (CRFs, HMMs)
  276. Deep Learning Methods (BiLSTM-CRF, Transformers)
  277. Relationship Extraction
  278. Pattern-based Methods
  279. Machine Learning Methods (SVMs, CRFs)
  280. Deep Learning Methods (CNNs, RNNs, Transformers)
  281. Coreference Resolution
  282. Rule-based Methods
  283. Machine Learning Methods (Mention-Pair, Entity-Mention)
  284. Deep Learning Methods (Mention Ranking, End-to-End Models)

  285. Computer Vision

  286. Image Preprocessing
  287. Pixel-level Operations (Scaling, Cropping, Rotation, Flipping)
  288. Filtering (Gaussian, Median, Bilateral)
  289. Edge Detection (Sobel, Canny, Laplacian)
  290. Morphological Operations (Erosion, Dilation, Opening, Closing)
  291. Feature Extraction
  292. Scale-Invariant Feature Transform (SIFT)
  293. Speeded Up Robust Features (SURF)
  294. Oriented FAST and Rotated BRIEF (ORB)
  295. Histogram of Oriented Gradients (HOG)
  296. Local Binary Patterns (LBP)
  297. Object Detection
  298. Viola-Jones
  299. Sliding Window
  300. Deformable Part Models (DPM)
  301. Region-based CNN (R-CNN, Fast R-CNN, Faster R-CNN)
  302. You Only Look Once (YOLO)
  303. Single Shot MultiBox Detector (SSD)
  304. RetinaNet
  305. Semantic Segmentation
  306. Fully Convolutional Networks (FCNs)
  307. U-Net
  308. DeepLab
  309. Mask R-CNN
  310. Instance Segmentation
  311. Mask R-CNN
  312. PANet
  313. Image Classification
  314. Convolutional Neural Networks (CNNs)
  315. Transfer Learning (VGG, ResNet, Inception, DenseNet, EfficientNet)
  316. Ensemble Methods (Bagging, Boosting)
  317. Object Tracking
  318. Kalman Filter
  319. Particle Filter
  320. Optical Flow
  321. Siamese Networks
  322. Correlation Filter
  323. Pose Estimation
  324. Deformable Part Models (DPM)
  325. Convolutional Pose Machines (CPMs)
  326. Stacked Hourglass Networks
  327. OpenPose
  328. Face Recognition
  329. Eigenfaces
  330. Local Binary Patterns Histograms (LBPH)
  331. FaceNet
  332. DeepFace
  333. DeepID
  334. Generative Models
  335. Variational Autoencoders (VAEs)
  336. Generative Adversarial Networks (GANs)
  337. Neural Style Transfer
  338. Deep Dream
  339. 3D Computer Vision
  340. Structure from Motion (SfM)
  341. Simultaneous Localization and Mapping (SLAM)
  342. Stereo Vision
  343. Point Cloud Processing
  344. Voxel-based Methods

  345. Graph Analytics

  346. Graph Representation
  347. Adjacency Matrix
  348. Adjacency List
  349. Edge List
  350. Incidence Matrix
  351. Graph Traversal
  352. Breadth-First Search (BFS)
  353. Depth-First Search (DFS)
  354. Shortest Path Algorithms
  355. Dijkstra's Algorithm
  356. Bellman-Ford Algorithm
  357. A* Search
  358. Floyd-Warshall Algorithm
  359. Centrality Measures
  360. Degree Centrality
  361. Betweenness Centrality
  362. Closeness Centrality
  363. Eigenvector Centrality
  364. PageRank
  365. HITS (Hubs and Authorities)
  366. Community Detection
  367. Girvan-Newman Algorithm
  368. Louvain Algorithm
  369. Infomap
  370. Spectral Clustering
  371. Stochastic Block Models
  372. Link Prediction
  373. Common Neighbors
  374. Jaccard Coefficient
  375. Adamic-Adar Index
  376. Preferential Attachment
  377. Katz Index
  378. Matrix Factorization
  379. Graph Embeddings
  380. DeepWalk
  381. node2vec
  382. Graph Convolutional Networks (GCNs)
  383. GraphSAGE
  384. Graph Attention Networks (GATs)
  385. Subgraph Matching
  386. Ullmann's Algorithm
  387. VF2 Algorithm
  388. Graph Kernels
  389. Network Motifs
  390. Motif Counting
  391. Motif Discovery
  392. Temporal Graph Analysis
  393. Temporal Motifs
  394. Dynamic Community Detection
  395. Temporal Link Prediction
  396. Graph Neural Networks (GNNs)
  397. Graph Convolutional Networks (GCNs)
  398. Graph Attention Networks (GATs)
  399. Graph Recurrent Networks (GRNs)
  400. Graph Autoencoders
  401. Graph Generative Models

  402. Recommender Systems

  403. Content-based Filtering
  404. TF-IDF
  405. Cosine Similarity
  406. Jaccard Similarity
  407. Collaborative Filtering
  408. User-based Collaborative Filtering
  409. Item-based Collaborative Filtering
  410. Matrix Factorization (Singular Value Decomposition, Non-Negative Matrix Factorization)
  411. Factorization Machines
  412. Probabilistic Matrix Factorization
  413. Hybrid Methods
  414. Weighted Hybrid
  415. Switching Hybrid
  416. Cascade Hybrid
  417. Feature Combination
  418. Meta-level
  419. Context-Aware Recommender Systems
  420. Contextual Pre-filtering
  421. Contextual Post-filtering
  422. Contextual Modeling
  423. Deep Learning-based Recommender Systems
  424. Neural Collaborative Filtering
  425. Deep Matrix Factorization
  426. Autoencoders
  427. Convolutional Neural Networks (CNNs)
  428. Recurrent Neural Networks (RNNs)
  429. Graph Neural Networks (GNNs)
  430. Evaluation Metrics
  431. Precision and Recall
  432. Mean Average Precision (MAP)
  433. Normalized Discounted Cumulative Gain (NDCG)
  434. Mean Reciprocal Rank (MRR)
  435. Coverage
  436. Diversity
  437. Novelty
  438. Serendipity

  439. Optimization Algorithms

  440. Gradient Descent
  441. Batch Gradient Descent
  442. Stochastic Gradient Descent (SGD)
  443. Mini-batch Gradient Descent
  444. Newton's Method
  445. Quasi-Newton Methods
  446. BFGS
  447. L-BFGS
  448. Conjugate Gradient Methods
  449. Momentum
  450. Nesterov Accelerated Gradient (NAG)
  451. Adagrad
  452. Adadelta
  453. RMSprop
  454. Adam
  455. AdaMax
  456. Nadam
  457. AMSGrad
  458. Evolutionary Algorithms
  459. Genetic Algorithms
  460. Evolutionary Strategies
  461. Particle Swarm Optimization (PSO)
  462. Ant Colony Optimization (ACO)
  463. Differential Evolution
  464. Swarm Intelligence Algorithms
  465. Artificial Bee Colony (ABC)
  466. Firefly Algorithm
  467. Cuckoo Search
  468. Bat Algorithm
  469. Simulated Annealing
  470. Tabu Search
  471. Hill Climbing
  472. Gradient-Free Optimization
  473. Nelder-Mead Method
  474. Pattern Search
  475. Bayesian Optimization
  476. Constrained Optimization
  477. Lagrange Multipliers
  478. Karush-Kuhn-Tucker (KKT) Conditions
  479. Interior Point Methods
  480. Penalty Methods
  481. Multi-Objective Optimization
  482. Weighted Sum Method
  483. ε-Constraint Method
  484. Pareto Optimization
  485. Non-dominated Sorting Genetic Algorithm (NSGA-II)
  486. Strength Pareto Evolutionary Algorithm (SPEA2)
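
As a small concrete instance of the gradient-descent family at the top of the optimization list above, here is a minimal Adam sketch (the standard update with bias correction and common default-ish hyperparameters) minimizing a simple quadratic:

```python
# Minimal Adam sketch: minimize f(w) = (w - 3)^2 from scratch with numpy.

import numpy as np

def grad_f(w):
    return 2.0 * (w - 3.0)           # gradient of (w - 3)^2

w = np.array(0.0)
m, v = 0.0, 0.0                      # first and second moment estimates
lr, beta1, beta2, eps = 0.1, 0.9, 0.999, 1e-8

for t in range(1, 301):
    g = grad_f(w)
    m = beta1 * m + (1 - beta1) * g              # exponential moving average of gradients
    v = beta2 * v + (1 - beta2) * g * g          # exponential moving average of squared gradients
    m_hat = m / (1 - beta1 ** t)                 # bias correction
    v_hat = v / (1 - beta2 ** t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)  # Adam update

print(w)   # close to 3.0
```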

This comprehensive map covers a wide range of algorithms and techniques used for extracting patterns and insights from various types of data, including tabular data, time series data, text data, image data, and graph data. It encompasses statistical methods, machine learning algorithms (both traditional and deep learning-based), natural language processing techniques, computer vision algorithms, graph analytics, recommender systems, and optimization algorithms.

The choice of algorithm depends on the specific problem at hand, the nature and structure of the data, the desired outcome, and the trade-offs between accuracy, interpretability, scalability, and computational efficiency. It is essential to have a good understanding of the strengths and limitations of each algorithm and to experiment with different approaches to find the most suitable one for a given task.

Furthermore, data preprocessing, feature engineering, model selection, hyperparameter tuning, and model evaluation are crucial steps in the data analysis pipeline that can significantly impact the performance of the chosen algorithm. It is also important to consider the ethical implications and potential biases associated with the use of these algorithms, especially in sensitive domains such as healthcare, finance, and criminal justice.
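
A minimal sketch of those pipeline steps (preprocessing, model selection, hyperparameter tuning, evaluation), assuming scikit-learn is available; the dataset, model, and parameter grid are illustrative choices, not a recommendation:

```python
# Sketch of a standard data-analysis pipeline: preprocess -> model -> tune -> evaluate.
# Assumes scikit-learn is installed; the dataset and hyperparameter grid are illustrative.

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split, GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import classification_report

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

pipe = Pipeline([
    ("scale", StandardScaler()),                    # preprocessing
    ("clf", LogisticRegression(max_iter=5000)),     # model
])

# Hyperparameter tuning with cross-validation
grid = GridSearchCV(pipe, {"clf__C": [0.01, 0.1, 1.0, 10.0]}, cv=5)
grid.fit(X_train, y_train)

# Held-out evaluation
print(grid.best_params_)
print(classification_report(y_test, grid.predict(X_test)))
```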

Additional metadata

  • processed #processing #toprocess #important #short #long #casual #focus

  • Unfinished: #metadata #tags