What is the class of this image?

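This article examines the current state of the art in object classification and compares several common image datasets, including MNIST, CIFAR-10, and CIFAR-100, to give readers a deeper understanding of how performance differs across image classification tasks.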