Artificial Intelligence Interview Questions and Answers
1. What is Artificial Intelligence?
A: AI is the simulation of human intelligence in machines that are programmed to think, learn, and solve problems like humans.
2. What are the types of AI?
- Narrow AI (Weak AI): Performs a specific task (e.g., recommendation engines).
- General AI (Strong AI): Machines with human-like cognitive abilities.
- Super AI: Hypothetical AI surpassing human intelligence.
3. What is the difference between AI and Machine Learning?
- AI: Broader concept where machines can perform smart tasks.
- ML: Subset of AI, machines learn from data without being explicitly programmed.
4. What is the Turing Test?
A: A test proposed by Alan Turing to determine whether a machine can exhibit behavior indistinguishable from a human's.
5. What is the difference between supervised, unsupervised, and reinforcement learning?
- Supervised: Learn from labeled data (classification, regression).
- Unsupervised: Find patterns in unlabeled data (clustering).
- Reinforcement: Learn by trial and error with rewards.
6. What is overfitting?
A: When a model learns the noise in the training data; it performs well on the training set but poorly on unseen data.
7. How can you reduce overfitting?
- Cross-validation
- Pruning (in trees)
- Regularization (L1, L2)
- Dropout (in NN)
- More data
8. What is underfitting?
A: The model is too simple to capture the underlying trend, so it performs poorly on both training and test data.
9. What is bias-variance tradeoff?
- Bias: Error from wrong assumptions (underfitting).
- Variance: Error from sensitivity to small fluctuations (overfitting).
- Goal: balance both.
10. Explain the difference between classification and regression.
- Classification: Predicts discrete labels (spam vs not spam).
- Regression: Predicts continuous output (house price).
11. What is gradient descent?
A: Optimization algorithm that minimizes a cost function by iteratively stepping in the direction of steepest descent, i.e., along the negative gradient.
12. What is the learning rate?
A: Size of steps taken during gradient descent. Too high: overshoot; too low: slow convergence.
13. What is stochastic gradient descent (SGD)?
A: Computes the gradient on a single random example (classically) or a small mini-batch instead of the full dataset; faster per step but noisier.
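To make Q11-13 concrete, here is a minimal NumPy sketch (not from the source) of mini-batch SGD fitting a line; the learning rate and batch size are assumed values:

```python
import numpy as np

# Toy data: y = 3x + 2 plus noise
rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(200, 1))
y = 3 * X[:, 0] + 2 + rng.normal(0, 0.1, size=200)

w, b = 0.0, 0.0           # parameters to learn
lr, batch_size = 0.1, 32  # learning rate and mini-batch size (assumed)

for epoch in range(100):
    idx = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        batch = idx[start:start + batch_size]
        xb, yb = X[batch, 0], y[batch]
        err = w * xb + b - yb
        # Gradients of the mean squared error w.r.t. w and b
        grad_w = 2 * np.mean(err * xb)
        grad_b = 2 * np.mean(err)
        w -= lr * grad_w  # step opposite the gradient (steepest descent)
        b -= lr * grad_b

print(round(w, 2), round(b, 2))  # should approach 3 and 2
```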
14. What is a confusion matrix?
A: Table showing true positives, false positives, true negatives, false negatives for classification.
15. What are precision, recall, and F1-score?
- Precision: TP / (TP + FP) – how many selected items are relevant.
- Recall: TP / (TP + FN) – how many relevant items are selected.
- F1: Harmonic mean of precision & recall.
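A quick sketch of these metrics, computed both by hand from confusion-matrix counts and via scikit-learn's built-in `f1_score` (the toy labels are made up):

```python
from sklearn.metrics import confusion_matrix, f1_score

y_true = [1, 0, 1, 1, 0, 1, 0, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0]

# For binary 0/1 labels, ravel() flattens the 2x2 matrix to (tn, fp, fn, tp)
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

precision = tp / (tp + fp)
recall = tp / (tp + fn)
f1 = 2 * precision * recall / (precision + recall)

print(precision, recall, f1)
print(f1_score(y_true, y_pred))  # matches the manual F1
```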
16. What is deep learning?
A: Subset of ML using neural networks with many layers to model complex patterns.
17. What is a neural network?
A: Computational model inspired by the brain, made of interconnected nodes (neurons).
18. What is backpropagation?
A: Algorithm to update weights in a neural network by computing the gradient of the loss with respect to the weights.
19. What is dropout?
A: Regularization technique where randomly selected neurons are ignored during training.
20. What is an activation function?
A: Introduces non-linearity into the network. Examples:
- ReLU: max(0, x)
- Sigmoid: outputs between 0 and 1
- Tanh: outputs between -1 and 1
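A minimal NumPy sketch of these three activations (illustrative only):

```python
import numpy as np

def relu(x):
    return np.maximum(0, x)      # zero for negatives, identity otherwise

def sigmoid(x):
    return 1 / (1 + np.exp(-x))  # squashes to (0, 1)

def tanh(x):
    return np.tanh(x)            # squashes to (-1, 1)

x = np.array([-2.0, 0.0, 2.0])
print(relu(x), sigmoid(x), tanh(x))
```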
21. What is CNN?
A: Convolutional Neural Network, specialized for image data, uses filters to capture spatial hierarchies.
22. What is RNN?
A: Recurrent Neural Network, processes sequential data by maintaining a hidden state.
23. What are vanishing gradients?
A: In deep networks, gradients shrink exponentially as they propagate backward through layers, so early layers learn very slowly (common in RNNs).
24. How does batch normalization help?
A: Normalizes layer inputs, speeds up training and can reduce overfitting.
25. What is NLP?
A: Field of AI focused on interaction between computers and human language.
26. What is tokenization?
A: Breaking text into smaller units (words or subwords).
27. What is stemming vs lemmatization?
- Stemming: Crudely cuts words to a root form (playing → play, studies → studi).
- Lemmatization: Maps to dictionary base form (better → good).
28. What is bag-of-words?
A: Represents text as a vector of word counts, ignoring grammar & word order.
29. What is TF-IDF?
A: Term Frequency-Inverse Document Frequency, weighs words by how important they are to a doc relative to a corpus.
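A short scikit-learn sketch contrasting bag-of-words counts with TF-IDF weights (assumes a recent scikit-learn with `get_feature_names_out`; the toy corpus is made up):

```python
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

docs = ["the cat sat", "the cat ran", "dogs ran fast"]

bow = CountVectorizer()          # bag-of-words: raw term counts
counts = bow.fit_transform(docs)
print(bow.get_feature_names_out())
print(counts.toarray())

tfidf = TfidfVectorizer()        # downweights terms common across all docs
weights = tfidf.fit_transform(docs)
print(weights.toarray().round(2))
```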
30. What is word embedding?
A: Vector representation of words capturing context. Examples: Word2Vec, GloVe.
31. What are transformers?
A: Attention-based architecture (e.g., BERT, GPT) that handles sequence data without recurrence.
32. What is the attention mechanism?
A: Allows models to focus on relevant parts of input sequence when predicting.
33. What is fairness in AI?
A: Ensuring models do not discriminate against protected groups.
34. What is model interpretability?
A: Ability to understand why a model made a prediction. Tools: SHAP, LIME.
35. What is an adversarial example?
A: An input with small, often imperceptible perturbations that fool a model (e.g., into misclassifying an image).
36. Give examples of AI applications.
- Self-driving cars
- Chatbots
- Fraud detection
- Medical imaging
37. What is reinforcement learning used for?
A: Games (AlphaGo), robotics, recommendation optimization.
39. What is a recommendation system?
A: Predicts user preferences based on past behavior or similar users/items.
39. What is hyperparameter tuning?
A: Process of finding best hyperparameters (e.g., learning rate, depth).
40. Techniques for hyperparameter tuning?
- Grid search
- Random search
- Bayesian optimization
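As a sketch, grid search with scikit-learn's `GridSearchCV` (the parameter grid and 5-fold CV are assumed choices; this also illustrates cross-validation from Q42):

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

# Candidate hyperparameters to try exhaustively (assumed grid)
param_grid = {"n_estimators": [50, 100], "max_depth": [3, 5, None]}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    cv=5,  # 5-fold cross-validation for each combination
)
search.fit(X, y)
print(search.best_params_, round(search.best_score_, 3))
```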
41. What is early stopping?
A: Stops training when performance on validation set stops improving.
42. What is cross-validation?
A: Technique to assess model performance by partitioning data into folds.
43. What is the gradient?
A: Vector of partial derivatives indicating direction of steepest increase.
44. What is softmax?
A: Converts vector of scores to probabilities summing to 1.
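A minimal, numerically stable softmax in NumPy (subtracting the max before exponentiating is a standard stability trick and does not change the result):

```python
import numpy as np

def softmax(z):
    e = np.exp(z - np.max(z))  # shift by max to avoid overflow
    return e / e.sum()

print(softmax(np.array([2.0, 1.0, 0.1])))  # probabilities summing to 1
```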
45. What is entropy?
A: Measure of uncertainty or randomness.
46. What is a ROC curve?
A: Graph of the true positive rate vs the false positive rate at different classification thresholds; AUC summarizes overall performance.
47. What is TensorFlow?
A: Open-source deep learning framework by Google.
48. What is PyTorch?
A: Popular deep learning library from Meta (Facebook); more Pythonic, with dynamic computation graphs.
49. What is scikit-learn used for?
A: Classical ML models, preprocessing, cross-validation.
50. What is Hugging Face Transformers?
A: Library for state-of-the-art NLP models.
51. What is a GAN?
A: Generative Adversarial Network. Consists of two nets — generator & discriminator — competing in a zero-sum game to produce realistic data.
52. Example applications of GANs?
- Image synthesis (deepfakes)
- Super-resolution
- Style transfer
53. What is LSTM?
A: Long Short-Term Memory, type of RNN that mitigates vanishing gradient with gates controlling information flow.
54. What is GRU?
A: Gated Recurrent Unit, simpler variant of LSTM with fewer gates.
55. What is an autoencoder?
A: Neural net trained to compress (encode) data and then reconstruct (decode), learning efficient representations.
56. What is a Variational Autoencoder (VAE)?
A: Adds probabilistic elements to autoencoders, allowing sampling from latent space.
57. What is transfer learning?
A: Using a pretrained model (like on ImageNet) and fine-tuning it on a new task.
58. What is fine-tuning vs feature extraction?
- Feature extraction: freeze base layers, train head.
- Fine-tuning: unfreeze some/all layers, train with low LR.
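A hedged PyTorch/torchvision sketch of both modes (assumes torchvision ≥ 0.13 for the `weights` API; the 10-class head is an assumed example):

```python
import torch.nn as nn
from torchvision import models

# Load an ImageNet-pretrained ResNet-18 (weights download on first use)
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)

# Feature extraction: freeze every pretrained parameter
for param in model.parameters():
    param.requires_grad = False

# Replace the classification head for a new 10-class task (assumed size);
# the new layer's parameters are trainable by default
model.fc = nn.Linear(model.fc.in_features, 10)

# For fine-tuning instead: unfreeze some/all layers and train with a low LR
```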
59. What is one-hot encoding?
A: Represent categorical variables as binary vectors.
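A quick scikit-learn sketch (assumes scikit-learn ≥ 1.2, where the `sparse_output` flag exists; the color column is made up):

```python
from sklearn.preprocessing import OneHotEncoder

colors = [["red"], ["green"], ["blue"], ["green"]]

enc = OneHotEncoder(sparse_output=False)  # dense array for readability
print(enc.fit_transform(colors))
# Each row becomes a binary vector with a single 1 marking its category
print(enc.categories_)
```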
60. Why is normalization important?
A: Scales input data, speeds up convergence. E.g., mean=0, std=1.
61. What is Q-learning?
A: Model-free RL algorithm that learns value of actions in states to maximize cumulative reward.
62. What is Bellman Equation?
A: Recursive equation describing the relationship between the value of a state and the values of its successor states.
63. What is epsilon-greedy policy?
A: With probability ε, explore random action; otherwise exploit best known action.
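A minimal sketch tying Q61-63 together: tabular Q-learning with an epsilon-greedy policy (the state/action sizes and hyperparameters are assumed; no real environment is included):

```python
import random

n_states, n_actions = 5, 2
Q = [[0.0] * n_actions for _ in range(n_states)]
alpha, gamma, eps = 0.1, 0.99, 0.1  # assumed hyperparameters

def choose_action(state):
    # Epsilon-greedy: explore with probability eps, otherwise exploit
    if random.random() < eps:
        return random.randrange(n_actions)
    return max(range(n_actions), key=lambda a: Q[state][a])

def update(state, action, reward, next_state):
    # Q-learning (Bellman) update toward reward + discounted best next value
    best_next = max(Q[next_state])
    Q[state][action] += alpha * (reward + gamma * best_next - Q[state][action])
```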
64. What is a policy gradient?
A: Directly optimizes the policy (probability distribution over actions) via gradient ascent.
65. What is the difference between value-based and policy-based RL?
- Value-based: learns value functions (Q-learning, DQN).
- Policy-based: directly optimizes policy (REINFORCE).
66. What is PPO?
A: Proximal Policy Optimization, stable policy gradient algorithm used in RL.
67. What is BERT?
A: Bidirectional Encoder Representations from Transformers, pretrained on masked LM + next sentence prediction.
68. What is GPT?
A: Generative Pretrained Transformer, autoregressive model trained to predict next token.
69. What is token embedding vs positional embedding?
- Token embedding: maps words/subwords to vectors
- Positional embedding: injects order info into input.
70. What is attention score formula?
A: Attention(Q, K, V) = softmax(QK^T / sqrt(d_k)) V
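A direct NumPy translation of this formula (illustrative; the shapes are made up):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    d_k = K.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)  # similarity of queries to keys
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V               # weighted sum of the values

rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(3, 4)) for _ in range(3))
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 4)
```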
71. What is multi-head attention?
A: Runs attention multiple times in parallel with different projections, captures various relationships.
72. What is a learning rate scheduler?
A: Adjusts learning rate during training. E.g., reduce LR on plateau.
73. What is gradient clipping?
A: Limits gradient norms to avoid exploding gradients.
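In PyTorch this is one call, `torch.nn.utils.clip_grad_norm_`, placed between `backward()` and `step()` (the threshold of 1.0 and the toy model are assumptions):

```python
import torch

# Toy model and optimizer (assumed, for illustration only)
model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

loss = model(torch.randn(8, 10)).pow(2).mean()
loss.backward()

# Rescale all gradients so their combined norm is at most 1.0
torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
optimizer.step()
```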
74. What is data augmentation?
A: Artificially increases dataset by modifying inputs (rotate, flip, noise).
75. What is early stopping?
A: Stop training when validation loss stops improving.
76. What is ONNX?
A: Open Neural Network Exchange, format for exporting models to run on different platforms.
77. What is TensorRT?
A: NVIDIA SDK that optimizes and compiles trained models for high-performance inference on NVIDIA GPUs.
78. What is MLflow?
A: Platform for tracking experiments, packaging code, deploying models.
79. What is Kubeflow?
A: Kubernetes-native platform for deploying scalable ML pipelines.
80. How to serve a model in production?
- REST API (FastAPI, Flask)
- TensorFlow Serving
- TorchServe
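A minimal FastAPI sketch of the REST-API option (the placeholder "model" is hypothetical; a real service would load a trained model at startup):

```python
from typing import List

from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Features(BaseModel):
    values: List[float]  # input schema for the request body

@app.post("/predict")
def predict(features: Features):
    # Placeholder: a real service would call model.predict(...) here
    score = sum(features.values)
    return {"prediction": score}

# Run with: uvicorn main:app --reload  (assuming this file is main.py)
```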
81. PyTorch vs TensorFlow main differences?
- PyTorch: dynamic graphs, easier debugging.
- TensorFlow: historically static graphs (TF2 is eager by default), strong production/deployment tooling.
82. What is Keras?
A: High-level API for building NN, now part of TensorFlow.
83. What is a scikit-learn pipeline?
A: Chains preprocessing and modeling into one workflow.
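A short example (the scaler/classifier choice is just an illustration):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Scaling and modeling chained so both fit together and reuse as one unit
pipe = Pipeline([
    ("scale", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
])
pipe.fit(X, y)
print(pipe.predict(X[:3]))
```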
84. What is CatBoost?
A: Gradient boosting library that handles categorical features automatically.
85. What is LightGBM?
A: Fast gradient boosting implementation by Microsoft, uses leaf-wise tree growth.
86. What is RMSE?
A: Root Mean Squared Error, common regression metric.
87. What is log loss?
A: Measures uncertainty of predictions for classification.
88. What is K-means?
A: Clustering algorithm that partitions data into k groups by minimizing variance.
89. What is silhouette score?
A: Measures how similar a point is to its own cluster vs other clusters.
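A sketch combining both: fit K-means on synthetic blobs and score the clustering with silhouette (the data and k are assumed):

```python
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs
from sklearn.metrics import silhouette_score

X, _ = make_blobs(n_samples=300, centers=3, random_state=0)

km = KMeans(n_clusters=3, n_init=10, random_state=0)
labels = km.fit_predict(X)

# Near 1: well separated; near 0: overlapping; negative: likely misassigned
print(round(silhouette_score(X, labels), 2))
```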
90. What is PCA?
A: Principal Component Analysis, reduces dimensionality by projecting onto principal components.
91. What is SHAP?
A: SHapley Additive exPlanations, interprets contributions of each feature.
92. What is LIME?
A: Local Interpretable Model-agnostic Explanations, perturbs input to see effect on prediction.
93. What is differential privacy?
A: Adds noise to data/models to protect individual data points.
94. What is fairness through unawareness?
A: Avoiding the use of protected attributes in the model; however, correlated proxy features may still introduce bias.
95. Why might validation loss increase while train loss decreases?
A: Overfitting: the model has started memorizing the training set instead of generalizing.
96. What to do if gradients are exploding?
- Lower learning rate
- Use gradient clipping
- Try batch normalization
97. What to check if accuracy doesn’t improve at all?
- Learning rate too high or too low
- Wrong labels
- Data leakage
98. How to debug training instability?
A: Plot learning curves; try a smaller architecture first.
99. How to deploy low-latency models?
- Quantization
- Pruning
- Use GPUs or edge accelerators
100. What is your favorite recent advancement in AI?
(Open-ended — could mention diffusion models, ChatGPT, AlphaFold, generative video.)