Unit 4: Classification and Clustering

This page lists the most important and most frequently asked previous-year questions for the B.Tech semester exam from Unit 4, Classification and Clustering, of the subject Data Warehouse and Mining.


  1. Write a note on classification and clustering.
  2. Discuss why analytical characterization and attribute relevance analysis are needed.
  3. Describe statistical measures in large databases.
  4. With the help of a suitable example, explain data discrimination in brief.
  5. Given the following set of values [1, 3, 9, 15, 20], determine the jackknife estimate for both the mean and the standard deviation of the mean.
  6. Write and explain a statistical-based algorithm.
  7. What is the main purpose of statistics in data mining?
  8. Explain the simple approach of distance-based algorithms.
  9. Write a short note on decision tree-based algorithms.
  10. Explain K nearest neighbours in the context of distance-based algorithms.
  11. Write a note on prediction and classification.
  12. What are the different classification techniques? Discuss issues regarding classification and prediction.
  13. Describe the various issues regarding classification and prediction.
  14. Explain the algorithm for classification by decision tree induction.
  15. What do you mean by a decision tree? Describe the ID3 algorithm of the decision tree. Why is it unsuitable for data mining applications?
  16. Describe classification. Briefly outline the major ideas of Bayesian classification.
  17. What is Naive Bayesian classification? Why is Naive Bayesian classification called "Naive"?
  18. What is clustering? How is it different from classification? Explain any one approach to clustering.
  19. Write the advantages and disadvantages of clustering.
  20. Explain the data types that often occur in cluster analysis and briefly explain how to preprocess that data for clustering.
  21. Discuss the various approaches to clustering.
  22. Explain the various requirements of clustering in data mining.
  23. Write a note on agglomerative hierarchical clustering.
  24. Write a note on partitioning methods.
  25. Explain the k-means clustering algorithm.
  26. Explain the nearest neighbor algorithm.
  27. Explain the squared error clustering algorithm.
  28. Explain the Partitioning Around Medoids (PAM) algorithm.
  29. Explain Chameleon hierarchical clustering.
  30. Write the basic difference between clustering and classification. Describe the density-based clustering method based on connected regions with sufficiently high density (DBSCAN).
  31. Write a short note on STING.
  32. Write a short note on OPTICS density-based clustering.
  33. Describe the CLIQUE algorithm.
  34. Explain market basket analysis. Describe the concept of association rule mining.
  35. Describe the mining of single-dimensional Boolean association rules from transactional databases.
  36. Describe the Apriori algorithm for FIM (Frequent Itemset Mining) and verify it through a suitable example.
  37. Why is the task of mining frequent itemsets difficult? Explain the reasons.
  38. Explain the mining of multidimensional association rules from relational databases and data warehouses.
  39. Describe the Apriori algorithm: finding frequent itemsets using candidate generation.
  40. Explain the mining of multilevel association rules from transactional databases.
  41. What do you mean by a neural network? Explain the multilayer feed-forward neural network. Differentiate between feed-forward and feedback systems.
  42. Describe a neural network. How is a neural network useful in classification? Explain.
  43. Write a note on the backpropagation algorithm.
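
The arithmetic in question 5 can be checked with a minimal sketch. It assumes the usual leave-one-out definition of the jackknife: each value is dropped in turn, the mean of the remaining values is recomputed, and the spread of those recomputed means gives the standard deviation (standard error) of the mean. The function name `jackknife_mean` is a hypothetical helper introduced here for illustration; it does not come from the syllabus or from any library.

```python
import math


def jackknife_mean(values):
    """Return (jackknife estimate of the mean, jackknife std. error of the mean).

    Assumes the standard leave-one-out jackknife; `values` needs >= 2 items.
    """
    n = len(values)
    total = sum(values)
    # Mean of the sample with the i-th value left out, for every i.
    leave_one_out = [(total - x) / (n - 1) for x in values]
    # The jackknife estimate is the average of the leave-one-out means.
    estimate = sum(leave_one_out) / n
    # Jackknife variance: (n - 1) / n times the sum of squared deviations.
    variance = (n - 1) / n * sum((m - estimate) ** 2 for m in leave_one_out)
    return estimate, math.sqrt(variance)


estimate, std_error = jackknife_mean([1, 3, 9, 15, 20])
print(round(estimate, 2), round(std_error, 3))  # prints: 9.6 3.572
```

For the mean, the jackknife estimate coincides with the ordinary sample mean (48 / 5 = 9.6), and the jackknife standard error equals the familiar s / sqrt(n), here sqrt(63.8 / 5) = sqrt(12.76).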