Translate From (Mar 4, 2024). How to Use ResNet-50. Roboflow Blog
ResNet-50, introduced in the 2015 paper "Deep Residual Learning for Image Recognition", is an image classification architecture developed by Microsoft Research. The default ResNet-50 checkpoint was trained on the ImageNet-1k dataset, which contains images from 1,000 classes.
In this guide, we will walk through how to install ResNet-50 and use it to classify images.
By the end of this guide, we will have code that assigns the class “forklift” to the following image:
What is ResNet-50?
ResNet-50 is an image classification model architecture: the 50-layer variant of the ResNet family introduced in 2015. An ensemble of ResNet models won first place on the ILSVRC 2015 image classification task. While many newer architectures that achieve strong performance have since been introduced, ResNet-50 remains a notable architecture in the history of computer vision.
The default ResNet-50 checkpoint can identify any of the 1,000 classes in the ImageNet-1k dataset.
You can load and run ResNet-50 with the Hugging Face Transformers Python package.
To get started, first install Transformers with pip by running pip install transformers. Loading images and running the model also requires Pillow and PyTorch.
Once you have installed Transformers, you can load the microsoft/resnet-50 checkpoint in your code with the ResNetForImageClassification model class.
To get started, create a new Python file and add the following code:
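A minimal sketch of such a file, assuming PyTorch and Pillow are installed alongside Transformers and that image.jpg sits in the working directory. The `classify_image` helper and the `MODEL_ID` constant are illustrative names chosen here, not part of the Transformers API; the library calls themselves (`AutoImageProcessor`, `ResNetForImageClassification`, `from_pretrained`) are the standard ones for this checkpoint.

```python
from pathlib import Path

MODEL_ID = "microsoft/resnet-50"  # checkpoint named in this guide


def classify_image(image_path, model_id=MODEL_ID):
    """Return the ImageNet-1k label that the model scores highest for one image."""
    # Heavy dependencies are imported here so the file can be imported without them.
    import torch
    from PIL import Image
    from transformers import AutoImageProcessor, ResNetForImageClassification

    # Open the image and force three colour channels (handles grayscale/RGBA files).
    image = Image.open(image_path).convert("RGB")

    # Download (on first use) and load the preprocessing config and model weights.
    processor = AutoImageProcessor.from_pretrained(model_id)
    model = ResNetForImageClassification.from_pretrained(model_id)

    # Resize and normalise the image, then pack it into a PyTorch tensor batch.
    inputs = processor(image, return_tensors="pt")

    # Run inference without tracking gradients.
    with torch.no_grad():
        logits = model(**inputs).logits

    # Pick the index of the largest logit and map it to its class name.
    predicted_id = logits.argmax(-1).item()
    return model.config.id2label[predicted_id]


if __name__ == "__main__":
    image_path = Path("image.jpg")
    if image_path.exists():
        print(classify_image(image_path))
    else:
        print(f"{image_path} not found; point the script at your own image.")
```

The first run downloads the checkpoint from the Hugging Face Hub, so it needs network access. For the forklift image used in this guide, the printed label should be the forklift class.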
In this code, we first open an image called image.jpg. Then, we load the model. We run inference with the model(**inputs) call. Finally, we retrieve the class to which the model assigned the highest confidence.
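The last step, picking the class with the highest confidence, can be illustrated without the model. The sketch below assumes the model output is a flat list of raw scores (logits), one per class; `softmax` and `top_k` are hypothetical helpers written for this example, and the four labels and scores are made up.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    largest = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(logits, id2label, k=3):
    """Return the k most likely (label, probability) pairs, best first."""
    probs = softmax(logits)
    ranked = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    return [(id2label[i], probs[i]) for i in ranked[:k]]


# Hypothetical four-class example; a real ResNet-50 output has 1,000 logits.
example_id2label = {0: "forklift", 1: "tractor", 2: "crane", 3: "golf cart"}
example_logits = [4.0, 2.0, 1.0, -1.0]

for label, prob in top_k(example_logits, example_id2label):
    print(f"{label}: {prob:.3f}")
```

With a real model, the same helpers should work on `logits[0].tolist()` together with `model.config.id2label`. Taking the argmax of the logits and taking the argmax of the softmax probabilities select the same class, since softmax preserves ordering.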
In the code above, replace image.jpg with the path to the image on which you want to run inference.
Conclusion and the Current Classification Landscape
ResNet-50 is an image classification architecture introduced in 2015; its default checkpoint was trained on the ImageNet-1k dataset. If you want to identify your own classes, you can train a model on a custom dataset using the ResNet architecture.
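One common starting point for custom classes is to reuse the pretrained backbone and replace the 1,000-class head. The sketch below assumes the Transformers `from_pretrained` options `num_labels`, `id2label`, `label2id`, and `ignore_mismatched_sizes`; the helper names and the example class list are hypothetical. It only prepares the model: the new head is randomly initialised and still has to be trained on your labelled data.

```python
def build_label_maps(class_names):
    """Build the id-to-label and label-to-id mappings for a list of class names."""
    id2label = {i: name for i, name in enumerate(class_names)}
    label2id = {name: i for i, name in id2label.items()}
    return id2label, label2id


def load_resnet_for_custom_classes(class_names, model_id="microsoft/resnet-50"):
    """Load the pretrained backbone with a fresh head sized for class_names."""
    # Imported lazily so the label helper works without Transformers installed.
    from transformers import ResNetForImageClassification

    id2label, label2id = build_label_maps(class_names)
    return ResNetForImageClassification.from_pretrained(
        model_id,
        num_labels=len(class_names),
        id2label=id2label,
        label2id=label2id,
        # The checkpoint's head has 1,000 outputs; allow it to be replaced.
        ignore_mismatched_sizes=True,
    )
```

For example, `load_resnet_for_custom_classes(["forklift", "pallet"])` would return a model with a two-class head, ready to be passed to a training loop.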
While ResNet is several years old, it remains a well-established image classification model. Since its release, many newer architectures that you can fine-tune on a custom dataset have been introduced, including the Vision Transformer, FastViT, Ultralytics YOLOv8, and ResNeXt.
There are also zero-shot classification models, which you can apply to arbitrary classes without any fine-tuning.
For example, you can use OpenAI CLIP to assign labels to images without fine-tuning the model. This is possible because CLIP was trained on a large dataset of images paired with a wide range of text descriptions.
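A minimal sketch of zero-shot labelling with CLIP through Transformers. It makes several assumptions not stated above: the openai/clip-vit-base-patch32 checkpoint, the "a photo of a ..." prompt template, and the helper names `build_prompts` and `zero_shot_classify`, which are illustrative only.

```python
def build_prompts(labels):
    """Turn bare class names into short text descriptions for CLIP."""
    return [f"a photo of a {label}" for label in labels]


def zero_shot_classify(image_path, labels, model_id="openai/clip-vit-base-patch32"):
    """Return a {label: probability} dict for one image and arbitrary labels."""
    # Imported lazily so the prompt helper works without the heavy dependencies.
    import torch
    from PIL import Image
    from transformers import CLIPModel, CLIPProcessor

    image = Image.open(image_path).convert("RGB")
    processor = CLIPProcessor.from_pretrained(model_id)
    model = CLIPModel.from_pretrained(model_id)

    # Tokenise the prompts and preprocess the image in one call.
    inputs = processor(
        text=build_prompts(labels), images=image, return_tensors="pt", padding=True
    )
    with torch.no_grad():
        outputs = model(**inputs)

    # Image-to-text similarity scores, normalised across the candidate labels.
    probs = outputs.logits_per_image.softmax(dim=1)[0].tolist()
    return dict(zip(labels, probs))
```

A call such as `zero_shot_classify("image.jpg", ["forklift", "truck", "crane"])` scores only the labels you pass in, so the class list can be changed at any time without retraining.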
Zero-shot models like CLIP can be used on their own (e.g., for classification or content moderation), or with an auto-labeling framework like Autodistill to label data for training a faster, fine-tuned vision model.