Milvus 是一种高性能、高扩展性的向量数据库,可在从笔记本电脑到大型分布式系统等各种环境中高效运行。它既可以开源软件的形式提供,也可以云服务的形式提供。 Milvus 是 LF AI & Data Foundation 下的一个开源项目,以 Apache 2.0 许可发布。大多数贡献者都是高性能计算(HPC)领域的专家,擅长构建大型系统和优化硬件感知代码。核心贡献者包括来自 Zilliz、ARM、NVIDIA、AMD、英特尔、Meta、IBM、Salesforce、阿里巴巴和微软的专业人士
Deeplearning4j(DL4J)是一个开源的深度学习框架,专门为Java和Scala开发。它支持分布式计算,适合在大数据环境中运行,比如与Hadoop或Spark集成。DL4J的特点包括:
多种网络架构:支持多种深度学习模型,包括卷积神经网络(CNN)、循环神经网络(RNN)和深度信念网络(DBN)。
集成与可扩展性:能够与大数据处理框架(如Apache Spark)和数据处理库(如ND4J)紧密集成,方便处理大规模数据集。
易于使用:提供高层API,简化模型构建和训练过程,同时也允许用户对底层实现进行细致的控制。
模型导入与导出:支持从其他框架(如Keras和TensorFlow)导入模型,并将训练好的模型导出为多种格式,以便于部署。
性能优化:支持多种硬件加速,包括GPU加速,能够提高训练和推理的效率。
支持多种应用场景:广泛应用于计算机视觉、自然语言处理、推荐系统等多个领域。
Deeplearning4j是企业和开发者进行深度学习开发和研究的强大工具,特别适合于需要与Java生态系统兼容的场景。
First, we’ll need an instance of Milvus DB. The easiest and quickest way is to get a fully managed free Milvus DB instance provided by Zilliz Cloud: https://zilliz.com/
For this, we’ll need to register for a Zilliz cloud account and follow the documentation for creating a free DB cluster.

利用Milvus和deeplearning4j实现图搜图功能
<project xmlns="http://maven.apache.org/POM/4.0.0"xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"xsi:schemaLocation="http://maven.apache.org/POM/4.0.0 http://maven.apache.org/xsd/maven-4.0.0.xsd"><parent><groupId>org.springframework.boot</groupId><artifactId>spring-boot-starter-parent</artifactId><version>3.2.1</version></parent><modelVersion>4.0.0</modelVersion><artifactId>Milvus</artifactId><properties><maven.compiler.source>17</maven.compiler.source><maven.compiler.target>17</maven.compiler.target><deeplearning4j.version>1.0.0-M2.1</deeplearning4j.version><nd4j.version>1.0.0-M2.1</nd4j.version></properties><dependencies><dependency><groupId>org.springframework.boot</groupId><artifactId>spring-boot-starter-web</artifactId></dependency><dependency><groupId>org.springframework.boot</groupId><artifactId>spring-boot-autoconfigure</artifactId></dependency><dependency><groupId>org.springframework.boot</groupId><artifactId>spring-boot-starter-test</artifactId><scope>test</scope></dependency><dependency><groupId>io.milvus</groupId><artifactId>milvus-sdk-java</artifactId><version>2.4.0</version></dependency><dependency><groupId>org.deeplearning4j</groupId><artifactId>deeplearning4j-zoo</artifactId><version>${deeplearning4j.version}</version></dependency><dependency><groupId>org.nd4j</groupId><artifactId>nd4j-native-platform</artifactId><version>${nd4j.version}</version></dependency><dependency><groupId>org.datavec</groupId><artifactId>datavec-data-image</artifactId><version>${deeplearning4j.version}</version></dependency><dependency><groupId>org.deeplearning4j</groupId><artifactId>deeplearning4j-core</artifactId><version>${deeplearning4j.version}</version></dependency><dependency><groupId>org.deeplearning4j</groupId><artifactId>deeplearning4j-modelimport</artifactId><version>${deeplearning4j.version}</version></dependency></dependencies><build><pluginManagement><plugins><plugin><groupId>org.apache.maven.plugins</groupId><artifactId>maven-compiler-plugin</artifactId><version>3.8.1</version><configuration><fork>true</fork><failOnError>false</failOnError></configuration></plugin><plugin><groupId>org.apache.maven.plugins</groupId><artifactId>maven-surefire-plugin</artifactId><version>2.22.2</version><configuration><forkCount>0</forkCount><failIfNoTests>false</failIfNoTests></configuration></plugin></plugins></pluginManagement></build></project>
package com.et.imagesearch;import org.deeplearning4j.zoo.model.ResNet50;import org.deeplearning4j.zoo.ZooModel;import org.deeplearning4j.nn.graph.ComputationGraph;import org.nd4j.linalg.api.ndarray.INDArray;import org.nd4j.linalg.dataset.api.preprocessor.ImagePreProcessingScaler;import org.datavec.image.loader.NativeImageLoader;import java.io.File;import java.io.IOException;public class FeatureExtractor {private ComputationGraph model;public FeatureExtractor() throws IOException {try {ZooModel<ComputationGraph> zooModel = ResNet50.builder().build();model = (ComputationGraph) zooModel.initPretrained();} catch (Exception e) {throw new IOException("Failed to initialize the pre-trained model: " + e.getMessage(), e);}}public INDArray extractFeatures(File imageFile) throws IOException {NativeImageLoader loader = new NativeImageLoader(224, 224, 3);INDArray image = loader.asMatrix(imageFile);ImagePreProcessingScaler scaler = new ImagePreProcessingScaler(0, 1);scaler.transform(image);return model.outputSingle(image);}}
加载图像: 使用
NativeImageLoader将图像加载为一个INDArray,并将图像的大小调整为 224x224 像素,通道数为 3(即 RGB 图像)。预处理图像: 使用
ImagePreProcessingScaler将图像数据缩放到 [0, 1] 的范围,以便模型可以更好地处理。特征提取: 使用模型的
outputSingle()方法将预处理后的图像输入模型,返回提取的特征向量。
package com.et.imagesearch;import io.milvus.client.*;import io.milvus.param.*;import io.milvus.param.collection.*;import io.milvus.param.dml.*;import io.milvus.grpc.*;import io.milvus.param.index.CreateIndexParam;import org.nd4j.linalg.api.ndarray.INDArray;import java.util.ArrayList;import java.util.Arrays;import java.util.Collections;import java.util.List;import java.util.stream.Collectors;public class MilvusManager {private MilvusServiceClient milvusClient;public MilvusManager() {milvusClient = new MilvusServiceClient(ConnectParam.newBuilder().withUri("https://xxx.gcp-us-west1.cloud.zilliz.com").withToken("xxx").build());}public void createCollection() {FieldType idField = FieldType.newBuilder().withName("id").withDataType(DataType.Int64).withPrimaryKey(true).build();FieldType vectorField = FieldType.newBuilder().withName("embedding").withDataType(DataType.FloatVector).withDimension(1000).build();CreateCollectionParam createCollectionParam = CreateCollectionParam.newBuilder().withCollectionName("image_collection").withDescription("Image collection").withShardsNum(2).addFieldType(idField).addFieldType(vectorField).build();milvusClient.createCollection(createCollectionParam);}public void insertData(long id, INDArray features) {List<Long> ids = Collections.singletonList(id);float[] floatArray = features.toFloatVector();List<Float> floatList = new ArrayList<>();for (float f : floatArray) {floatList.add(f);}List<List<Float>> vectors = Collections.singletonList(floatList);List<InsertParam.Field> fields = new ArrayList<>();fields.add(new InsertParam.Field("id",ids));fields.add(new InsertParam.Field("embedding", vectors));InsertParam insertParam = InsertParam.newBuilder().withCollectionName("image_collection").withFields(fields).build();milvusClient.insert(insertParam);}public void flush() {milvusClient.flush(FlushParam.newBuilder().withCollectionNames(Collections.singletonList("image_collection")).withSyncFlush(true).withSyncFlushWaitingInterval(50L).withSyncFlushWaitingTimeout(30L).build());}public void buildindex() {// build indexSystem.out.println("Building AutoIndex...");final IndexType INDEX_TYPE = IndexType.AUTOINDEX; // IndexTypelong startIndexTime = System.currentTimeMillis();R<RpcStatus> indexR = milvusClient.createIndex(CreateIndexParam.newBuilder().withCollectionName("image_collection").withFieldName("embedding").withIndexType(INDEX_TYPE).withMetricType(MetricType.L2).withSyncMode(Boolean.TRUE).withSyncWaitingInterval(500L).withSyncWaitingTimeout(30L).build());long endIndexTime = System.currentTimeMillis();System.out.println("Succeed in " + (endIndexTime - startIndexTime) / 1000.00 + " seconds!");}}
createCollection():
id: 主键,类型为
Int64。embedding: 特征向量,类型为
FloatVector,维度为 1000。创建一个名为
image_collection的集合,包含两个字段:使用
CreateCollectionParam指定集合的名称、描述和分片数量,并调用createCollection方法执行创建操作。insertData(long id, INDArray features):
插入一条新数据到
image_collection集合中。将
INDArray类型的特征向量转换为List<List<Float>>格式,以满足 Milvus 的插入要求。创建一个
InsertParam实例,包含 ID 和特征向量,并调用insert方法执行插入操作。flush():
刷新
image_collection集合,确保所有待处理的插入操作都被写入数据库。使用
FlushParam配置同步刷新模式和等待参数,确保操作的可靠性。buildindex():
构建
image_collection集合中embedding字段的索引,以加快后续的相似性搜索。使用
CreateIndexParam指定集合名称、字段名称、索引类型(自动索引)和度量类型(L2距离)。调用
createIndex方法执行索引创建,并输出所用时间。
package com.et.imagesearch;import io.milvus.client.MilvusServiceClient;import io.milvus.grpc.SearchResults;import io.milvus.param.ConnectParam;import io.milvus.param.MetricType;import io.milvus.param.R;import io.milvus.param.dml.SearchParam;import io.milvus.response.SearchResultsWrapper;import org.nd4j.linalg.api.ndarray.INDArray;import java.util.ArrayList;import java.util.Arrays;import java.util.Collections;import java.util.List;import java.util.stream.Collectors;public class ImageSearcher {private MilvusServiceClient milvusClient;public ImageSearcher() {milvusClient = new MilvusServiceClient(ConnectParam.newBuilder().withUri("https://ixxxxx.gcp-us-west1.cloud.zilliz.com").withToken("xxx").build());}public void search(INDArray queryFeatures) {float[] floatArray = queryFeatures.toFloatVector();List<Float> floatList = new ArrayList<>();for (float f : floatArray) {floatList.add(f);}List<List<Float>> vectors = Collections.singletonList(floatList);SearchParam searchParam = SearchParam.newBuilder().withCollectionName("image_collection").withMetricType(MetricType.L2).withTopK(5).withVectors(vectors).withVectorFieldName("embedding").build();R<SearchResults> searchResults = milvusClient.search(searchParam);System.out.println("Searching vector: " + queryFeatures.toFloatVector());System.out.println("Result: " + searchResults.getData().getResults().getFieldsDataList());}}
特征转换: 将
INDArray转换为float[]数组,然后将其转换为List<Float>。这是因为 Milvus 需要特定格式的向量输入。构建搜索参数: 创建一个
SearchParam对象,指定要搜索的集合名称、度量类型(例如 L2 距离)、返回的最相似的前 K 个结果、向量字段名称以及搜索的向量数据。执行搜索: 使用
milvusClient的search方法执行搜索,并将结果存储在searchResults中。结果输出: 打印出搜索的特征向量和搜索结果。
package com.et.imagesearch;import org.nd4j.linalg.api.ndarray.INDArray;import java.io.File;import java.io.IOException;public class Main {public static void main(String[] args) throws IOException {FeatureExtractor extractor = new FeatureExtractor();MilvusManager milvusManager = new MilvusManager();ImageSearcher searcher = new ImageSearcher();milvusManager.createCollection();// images extractFile[] imageFiles = new File("/Users/liuhaihua/ai/ut-zap50k-images-square/Boots/Ankle/Columbia").listFiles();if (imageFiles != null) {for (int i = 0; i < imageFiles.length; i++) {INDArray features = extractor.extractFeatures(imageFiles[i]);milvusManager.insertData(i, features);}}milvusManager.flush();milvusManager.buildindex();// queryFile queryImage = new File("/Users/liuhaihua/ai/ut-zap50k-images-square/Boots/Ankle/Columbia/7247580.16952.jpg");INDArray queryFeatures = extractor.extractFeatures(queryImage);searcher.search(queryFeatures);}}
以上只是一些关键代码,所有代码请参见下面代码仓库
https://github.com/Harries/springboot-demo(Milvus)
启动main方法
查看云数据中数据
控制台可以看到搜图结果
https://cloud.zilliz.com/
https://github.com/deeplearning4j
http://www.liuhaihua.cn/archives/711596.html
版权声明:本文内容由互联网用户自发贡献,该文观点仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请发送邮件至 举报,一经查实,本站将立刻删除。
如需转载请保留出处:https://bianchenghao.cn/bian-cheng-ri-ji/73550.html