| Aspect | Feature Selection | Feature Engineering |
| --- | --- | --- |
| Purpose | Choose relevant features | Create informative features |
| Objective | Improve model performance by eliminating irrelevant or redundant features | Improve model's ability to capture patterns and relationships |
| Techniques | Correlation analysis, mutual information, feature importance scores, recursive feature elimination | Scaling, one-hot encoding, interaction terms, mathematical operations, date-related feature extraction |
| Automation | Can often be automated using statistical methods and algorithms | May require domain knowledge and human expertise |
| Examples | SelectKBest, SelectFromModel, Recursive Feature Elimination (RFE) | One-hot encoding, polynomial feature creation, date-related feature extraction |
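
To make the contrast in the table concrete, here is a minimal, dependency-free sketch of both steps. It first engineers features (one-hot encoding, date-related extraction, an interaction term) and then selects the top k by absolute Pearson correlation with the target, which is roughly the idea behind a univariate selector such as SelectKBest. The dataset, the field names (`color`, `sold_on`, `width`, `height`), and the helper functions `engineer` and `select_k_best` are all hypothetical and exist only for illustration.

```python
from datetime import date
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear signal
    return cov / sqrt(var_x * var_y)


def engineer(rows):
    """Feature engineering: derive new numeric columns from raw fields."""
    colors = sorted({r["color"] for r in rows})
    engineered = []
    for r in rows:
        # One-hot encoding of the categorical field.
        feats = {f"color_{c}": float(r["color"] == c) for c in colors}
        # Date-related feature extraction.
        feats["month"] = float(r["sold_on"].month)
        feats["is_weekend"] = float(r["sold_on"].weekday() >= 5)
        # Interaction term between two numeric fields.
        feats["area"] = r["width"] * r["height"]
        engineered.append(feats)
    return engineered


def select_k_best(features, target, k):
    """Feature selection: keep the k columns most correlated with the target."""
    names = list(features[0])
    scores = {n: abs(pearson([f[n] for f in features], target)) for n in names}
    return sorted(names, key=lambda n: scores[n], reverse=True)[:k]


# Hypothetical toy data, invented for this sketch.
rows = [
    {"color": "red", "sold_on": date(2024, 1, 6), "width": 2.0, "height": 3.0},
    {"color": "blue", "sold_on": date(2024, 2, 14), "width": 1.0, "height": 4.0},
    {"color": "red", "sold_on": date(2024, 3, 10), "width": 5.0, "height": 2.0},
    {"color": "green", "sold_on": date(2024, 4, 2), "width": 3.0, "height": 3.0},
    {"color": "blue", "sold_on": date(2024, 5, 18), "width": 1.0, "height": 1.0},
]
target = [60.0, 40.0, 100.0, 90.0, 10.0]  # constructed to track width * height

engineered = engineer(rows)
selected = select_k_best(engineered, target, k=1)
print(selected)
```

Neither raw field predicts the target well on its own here; the engineered `area` column does, and the selection step then picks it out. That ordering (engineer first, select second) is one common arrangement, not the only one.
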
Vector Databases

Usage: Typically used for vector search use cases such as visual, semantic, and multimodal search. More recently, they are paired with generative AI text models for conversational search experiences.

Development Process: Begins with building an embedding model designed to encode a corpus (e.g., product images) into vectors. The data import process is referred to as data hydration.

Application Development: Application developers utilize the database to search for similar products. This involves encoding a product image and using the vector to query for similar images.

k-Nearest Neighbor (k-NN) Indexes: Within the model, k-nearest neighbor (k-NN) indexes facilitate efficient retrieval of vectors. A distance function like cosine is applied to rank results by similarity.
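
The query flow described above (encode an item, then rank stored vectors by cosine similarity) can be sketched as a brute-force k-NN search. This is an illustrative sketch only: the product IDs and three-dimensional vectors are made up, the `knn_search` helper is hypothetical, and a real embedding model would produce far larger vectors. Production vector databases generally rely on approximate index structures rather than scanning every vector as this example does.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sqrt(sum(x * x for x in a))
    norm_b = sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def knn_search(index, query, k):
    """Return the k (id, score) pairs most similar to the query, best first."""
    scored = [(item_id, cosine_similarity(vec, query)) for item_id, vec in index.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Hypothetical "hydrated" index: product id -> embedding vector.
index = {
    "sneaker_red": [0.9, 0.1, 0.0],
    "sneaker_blue": [0.8, 0.2, 0.1],
    "handbag": [0.0, 0.9, 0.4],
}

# Stand-in for the vector an embedding model would produce for a query image.
query = [1.0, 0.1, 0.0]

results = knn_search(index, query, k=2)
for item_id, score in results:
    print(f"{item_id}: {score:.3f}")
```

Cosine similarity ignores vector length and compares direction only, which is why it is a common choice when embeddings are not normalized to a fixed magnitude.
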