
SmartLens, an app developed by a high schooler, is a step toward all-purpose visual search


A couple of years ago I was eagerly awaiting an app that would identify whatever it was you pointed it at. Turns out the problem was much harder than anyone expected — but that didn’t stop high school senior Michael Royzen from trying. His app, SmartLens, attempts to solve the problem of encountering something and wanting to identify and learn more about it — with mixed success, to be sure, but it’s something I don’t mind having in my pocket.

Royzen reached out to me a while back and I was curious — as well as skeptical — about the notion that where the likes of Google and Apple have so far failed (or at least haven’t released anything good), a high schooler working in his spare time would succeed. I met him at a coffee shop to see the app in action and was pleasantly surprised, but a bit baffled.

The idea is simple, of course: You point your phone’s camera at something and the app attempts to identify it using an enormous but highly optimized classification model trained on tens of millions of images. It is connected to Wikipedia and Amazon to let you immediately learn more about what you’ve ID’ed, or buy it.
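The core loop is easy to sketch. Below is a minimal, hypothetical illustration of the classify-then-link flow; the label set and logit values are invented for illustration, not SmartLens’s actual model or code:

```python
import math

# Hypothetical sketch of the classify-then-link flow. The label set and
# logit values are invented for illustration; the real app runs a large
# on-device CNN over 17,000+ classes.
LABELS = ["banana", "plantain", "mango", "wood sorrel"]

def softmax(scores):
    """Convert raw model outputs (logits) into probabilities."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(logits, labels=LABELS):
    """Return the top label and its confidence for one camera frame."""
    probs = softmax(logits)
    best = max(range(len(labels)), key=lambda i: probs[i])
    return labels[best], probs[best]

label, confidence = classify([4.1, 1.0, 0.3, -2.0])
# Once a label is chosen, linking out is just string assembly.
wiki_url = "https://en.wikipedia.org/wiki/" + label.replace(" ", "_")
```

The interesting engineering is entirely inside the model producing those logits; everything after it is bookkeeping.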

It distinguishes more than 17,000 objects — things like different species of fruit and flowers, landmarks, tools and so on. The app had little difficulty telling an apple from a (weird-looking) mango, a banana from a plantain, and it even identified the pistachios I’d had as a snack. Later, in my own testing, I found it quite useful for identifying the weeds springing up in my neighborhood: periwinkles, anemones, wood sorrel, it got them all, though not without the occasional hesitation.

The kicker is that this all happens offline — it’s not sending an image over the cell network or Wi-Fi to a server somewhere to be analyzed. It all happens on-device and within a second or two. Royzen scraped his own image database from various sources and trained multiple convolutional neural networks using days of AWS EC2 compute time.
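Shipping the model with the app is what makes offline inference possible, and also what makes the download big. Some back-of-envelope arithmetic shows why, and why weight quantization (storing each weight in one byte instead of four) is the standard trick for shrinking on-device models; the parameter count below is an assumption for illustration, not SmartLens’s actual architecture:

```python
# Back-of-envelope arithmetic on why an offline recognizer is a big
# download. The parameter count is assumed for illustration only.
def model_size_mb(num_params, bytes_per_weight):
    """Approximate on-disk size of a network's weights in megabytes."""
    return num_params * bytes_per_weight / 1e6

params = 60_000_000                      # assumed total across the networks
float32_mb = model_size_mb(params, 4)    # full-precision weights: 240 MB
int8_mb = model_size_mb(params, 1)       # 8-bit quantized weights: 60 MB
```

Even with quantization, a model covering 17,000+ classes plus offline reference text adds up quickly.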

Then there are far more products than that, which it recognizes by reading the text of the item and querying the Amazon database. It ID’ed books, a bottle of pills and other packaged goods almost instantaneously, with links to buy them. Wikipedia links pop up if you’re online as well, though a considerable number of basic descriptions are stored on the device.
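The read-the-label path is conceptually a text match rather than image classification. Here is a toy sketch of the idea, with a made-up index and made-up ASINs standing in for a real query against Amazon’s catalog:

```python
# Toy sketch of the "read the label, then look it up" path: OCR tokens
# from a package are matched against a product index. The entries and
# ASINs are hypothetical; the real app queries Amazon's catalog.
PRODUCT_INDEX = {
    ("ibuprofen", "200mg"): "B000FAKE01",   # hypothetical ASIN
    ("roasted", "pistachios"): "B000FAKE02",
}

def lookup(ocr_tokens):
    """Return a purchase URL if the OCR'd words match an indexed product."""
    tokens = {t.lower() for t in ocr_tokens}
    for keywords, asin in PRODUCT_INDEX.items():
        if set(keywords) <= tokens:
            return "https://www.amazon.com/dp/" + asin
    return None
```

Because packaged goods carry dense, distinctive text, this path can be both faster and more reliable than classifying the image itself.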

On that note, it must be said that SmartLens is a more than 500-megabyte download. Royzen’s model is huge, since it must keep all the recognition data and offline material right there on the phone. This is a much different approach to the problem than Amazon’s own product recognition engine on the Fire Phone (RIP) or Google Goggles (RIP) or the scan feature in Google Photos (which was pretty useless for things SmartLens reliably did in half a second).

“With the past several generations of smartphones containing desktop-class processors and the advent of native machine learning APIs that can harness them (and GPUs), the hardware exists for a blazing-fast visual search engine,” Royzen wrote in an email. But none of the large companies you would expect to create one has done so. Why?

The app size and toll on the processor is one thing, for sure, but the edge and on-device processing is where all this stuff will go eventually — Royzen is just getting an early start. The likely truth is twofold: it’s hard to make money, and the quality of the search isn’t high enough.

It must be said at this point that SmartLens, while smart, is still far from infallible. Its suggestions for what an item might be are almost always hilariously incorrect for a moment before it arrives at, as it often does, the correct answer.

It identified one book I had as “White Whale,” and no, it wasn’t Moby Dick. An actual whale paperweight it decided was a trowel. Numerous items briefly flashed guesses of “Human being” or “Product design” before getting to a guess with higher confidence. One flowering bush it identified as four or five different weeds — including, of course, Human being. My monitor was a “computer display,” “liquid crystal display,” “computer monitor,” “computer,” “computer screen,” “display device” and more. Game controllers were all “control.” A spatula was a wooden spoon (close enough), with the inexplicable subheading “booby prize.” What?!
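That flash of wrong guesses is a UI problem as much as a model problem: per-frame predictions are noisy, so a common fix is to debounce them and only surface a label once it has stayed on top for several consecutive frames above a confidence floor. A sketch of that idea, not the app’s actual code:

```python
# Debounce noisy per-frame predictions: only surface a label after it
# stays on top for several consecutive frames above a confidence floor.
# Thresholds and frame data are illustrative assumptions.
def stable_label(frames, min_conf=0.6, min_streak=3):
    """frames: iterable of (label, confidence) pairs, one per camera frame."""
    streak, current = 0, None
    for label, conf in frames:
        if conf >= min_conf and label == current:
            streak += 1
        else:
            current, streak = label, 1 if conf >= min_conf else 0
        if streak >= min_streak:
            return current
    return None

frames = [("Human being", 0.3), ("trowel", 0.65),
          ("whale", 0.7), ("whale", 0.8), ("whale", 0.9)]
```

The trade-off is a fraction of a second of extra latency in exchange for never flashing “Human being” at a flowering bush.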

This level of performance (and weirdness in general, however entertaining) wouldn’t be tolerated in a standalone product released by Google or Apple. Google Lens was slow and bad, but it’s just an optional feature in a working, useful app. If either put out a visual search app that identified plants as people, the company would never hear the end of it.

And the other side of it is the monetization aspect. Although it’s theoretically convenient to be able to snap a picture of a book your friend has and instantly order it, it isn’t so much more handy than taking a picture and searching for it later, or just typing the first few words into Google or Amazon, which will do the rest for you.

Meanwhile for the user there is still confusion. What can it identify? What can’t it identify? What do I need it to identify? It’s meant to ID many things, from dog breeds to storefronts, but it likely won’t identify, for example, a cool Bluetooth speaker or mechanical watch your friend has, or the painter of a painting at a local gallery (some paintings are recognized, though). As I used it I felt like I was only ever going to use it for a handful of tasks in which it had proven itself, like identifying flowers, but would hesitate to try it on many other things when I might just be frustrated by some unknown incapability or unreliability.

And yet the idea that in the near future there will not be something just like SmartLens seems foolish to me. It seems so clearly something we will all take for granted in a few years. And it’ll be on-device — no need to upload your image to a server somewhere to be analyzed on your behalf.

Royzen’s app has its issues, but it worked very well in many circumstances and has obvious practicality. The idea that you could point your phone at the restaurant you’re across the street from and see Yelp reviews two seconds later — no need to open up a map or type in an address or name — is an extremely natural extension of existing search paradigms.

“Visual search is still a niche, but my goal is to give people the experience of a future where one app can provide useful information about anything around them — today,” wrote Royzen. “Still, it’s inevitable that big companies will launch their competing offerings eventually. My plan is to beat them to market as the first universal visual search app and amass as many users as possible so I can stay ahead (or be acquired).”

My biggest gripe of all, however, is not with the capabilities of the app, but with how Royzen has decided to monetize it. Users can download it for free but upon opening it are instantly prompted to sign up for a $2/month subscription (though the first month is free) — before they can even see whether the app works or not. If I didn’t already know what the app did and didn’t do, I would delete it without a second thought upon seeing that dialog, and even knowing what I do, I’m not likely to pay in perpetuity for it.

A one-time fee to activate the app would be more than reasonable, and there’s always the option of referral fees for those Amazon purchases. But asking rent from users who haven’t even tested the product is a non-starter. I’ve told Royzen my concerns and I hope he reconsiders.

It would also be nice to be able to scan images you’ve already taken, or save images associated with searches. UI improvements like a confidence indicator or some kind of feedback to let you know it’s still working on an identification would be nice as well — features that are at least theoretically on the way.

In the end I’m impressed with Royzen’s efforts — when I take a step back it’s amazing to me that it’s possible for a single person, let alone one in high school, to put together an app capable of completing such sophisticated computer vision tasks. It’s the kind of (over-)ambitious app-building one expects to come out of a big, vibrant company like the Google of ten years ago. This is perhaps more of a curiosity than a tool right now, but so were the first text-based search engines.

SmartLens is in the App Store now — give it a shot.



8 big announcements from Google I/O 2018

Google kicked off its annual I/O developer conference at Shoreline Amphitheater in Mountain View, California. Here are some of the most important announcements from the Day 1 keynote. There is even more to come over the next couple of days, so follow along with everything Google I/O on TechCrunch.

Google goes all in on neural networks, rebranding its research division to Google AI

Just before the keynote, Google announced it is rebranding its Google Research division to Google AI. The move signals how Google has increasingly focused R&D on computer vision, natural language processing, and neural networks.

Google makes talking to the Assistant more natural with “continued conversation”

What Google announced: Google announced a “continued conversation” update to Google Assistant that makes talking to the Assistant feel more natural. Now, instead of having to say “Hey Google” or “OK Google” every time you want to issue a command, you’ll only have to do so the first time. The company is also adding a new feature that allows you to ask multiple questions within the same request. All this will roll out in the coming weeks.

Why it’s important: When you’re having a typical conversation, chances are you’ll ask follow-up questions if you didn’t get the answer you wanted. But it can be jarring to have to say “Hey Google” every single time, and it breaks the whole flow and makes the process feel quite unnatural. If Google wants to be a significant player when it comes to voice interfaces, the actual interaction has to feel like a conversation — not just a series of queries.
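Mechanically, “continued conversation” amounts to a follow-up window: after the Assistant replies, the mic stays open for a short period during which no hotword is needed. The sketch below illustrates that state logic; the window length and function names are assumptions, not Google’s implementation:

```python
# Sketch of "continued conversation" as a follow-up window: only the
# first query of a session needs the hotword. The 8-second window is an
# assumed value for illustration.
FOLLOW_UP_WINDOW = 8.0  # seconds the mic stays open after a reply (assumed)

def needs_hotword(now, last_reply_at):
    """Return True if the user must say 'Hey Google' again."""
    if last_reply_at is None:  # first interaction of the session
        return True
    return (now - last_reply_at) > FOLLOW_UP_WINDOW
```

The design question hiding in that window is privacy: the longer the mic stays open, the more natural the conversation, and the more incidental audio the device hears.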

Google Photos gets an AI boost

What Google announced: Google Photos already makes it easy for you to correct photos with built-in editing tools and AI-powered features for automatically creating collages, movies and stylized photos. Now, Photos is getting more AI-powered fixes like B&W photo colorization, brightness correction and suggested rotations. A new version of the Google Photos app will suggest quick fixes and tweaks like rotations, brightness corrections or adding pops of color.

Why it’s important: Google is working to become a hub for all of your photos, and it’s able to woo potential users by offering powerful tools to edit, sort, and modify those photos. Each additional photo Google gets offers it more data and helps it get better and better at image recognition, which in the end not only improves the user experience but also makes the tools behind its services better. Google, at its heart, is a search company — and it needs a lot of data to get visual search right.

Google Assistant and YouTube are coming to Smart Displays

What Google announced: Smart Displays were the talk of Google’s CES push this year, but we haven’t heard much about Google’s Echo Show competitor since. At I/O, we got a little more insight into the company’s smart display efforts. Google’s first Smart Displays will launch in July, and of course will be powered by Google Assistant and YouTube. It’s clear that the company has put some resources into building a visual-first version of Assistant, justifying the addition of a screen to the experience.

Why it’s important: Users are increasingly growing accustomed to the idea of some smart device sitting in their living room that will answer their questions. But Google is looking to create a system where a user can ask questions and then have the option of some kind of visual display for answers that just can’t be resolved with a voice interface. Google Assistant handles the voice part of that equation — and YouTube is a good service that goes alongside it.

Google Assistant is coming to Google Maps

What Google announced: Google Assistant is coming to Google Maps, available on iOS and Android this summer. The addition is meant to provide better recommendations to users. Google has long tried to make Maps feel more personalized, but since Maps is now about far more than just directions, the company is introducing new features to give you better recommendations for local places.

The Maps integration also combines the camera, computer vision technology, and Google Maps with Street View. With the camera/Maps combination, it really looks like you’ve jumped inside Street View. Google Lens can do things like identify buildings, or even dog breeds, just by pointing your camera at the object in question. It will also be able to identify text.

Why it’s important: Maps is one of Google’s biggest and most important products. There’s a lot of excitement around augmented reality — you can point to phenomena like Pokémon Go — and companies are just starting to scratch the surface of the proper use cases for it. Figuring out directions seems like such a natural use case for a camera, and while it was a bit of a technical feat, it gives Google yet another perk for its Maps users to keep them inside the service and not switch over to alternatives. Again, with Google, everything comes back to the data, and it’s able to capture more data if users stick around in its apps.

Google announces a new generation of its TPU machine learning hardware

What Google announced: As the race to create customized AI hardware heats up, Google said that it is rolling out its third generation of silicon, the Tensor Processor Unit 3.0. Google CEO Sundar Pichai said the new TPU is 8x more powerful than last year’s per pod, with up to 100 petaflops in performance. Google joins pretty much every other major company in looking to create custom silicon to handle its machine learning operations.
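Those two figures can be sanity-checked against each other. If a TPU 3.0 pod peaks at 100 petaflops and is 8x last year’s pod, the implied TPU 2.0 pod figure is 12.5 petaflops, roughly in line with the ~11.5 petaflops Google quoted for TPU 2.0 pods in 2017:

```python
# Quick arithmetic on the TPU 3.0 claims from the keynote.
tpu3_pod_petaflops = 100   # claimed peak per TPU 3.0 pod
speedup = 8                # claimed improvement over last year's pod

# Implied TPU 2.0 pod performance, in petaflops.
implied_tpu2_pod = tpu3_pod_petaflops / speedup
```

The marketing numbers are at least internally consistent, though “8x” bundles both faster chips and larger pods.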

Why it’s important: There’s a race to create the best machine learning tools for developers. Whether that’s at the framework level with tools like TensorFlow or PyTorch or at the actual hardware level, the company that’s able to lock developers into its ecosystem will have an advantage over its competitors. It’s especially important as Google looks to build its cloud platform, GCP, into a massive business while going up against Amazon’s AWS and Microsoft Azure. Giving developers — who are already adopting TensorFlow en masse — a way to speed up their operations can help Google continue to woo them into Google’s ecosystem.

MOUNTAIN VIEW, CA - MAY 08: Google CEO Sundar Pichai delivers the keynote address at the Google I/O 2018 Conference at Shoreline Amphitheater on May 8, 2018 in Mountain View, California. Google’s two-day developer conference runs through Wednesday May 9. (Photo by Justin Sullivan/Getty Images)

Google News gets an AI-powered redesign

What Google announced: Watch out, Facebook. Google is also planning to leverage AI in a revamped version of Google News. The AI-powered, redesigned news app will “allow users to keep up with the news they care about, understand the full story, and enjoy and support the publishers they trust.” It will leverage elements found in Google’s digital magazine app, Newsstand, and YouTube, and introduces new features like “newscasts” and “full coverage” to help people get a summary or a more holistic view of a news story.

Why it’s important: Facebook’s central product is literally called “News Feed,” and it serves as a major source of information for a non-trivial portion of the world. But Facebook is embroiled in a scandal over the personal data of as many as 87 million users ending up in the hands of a political research firm, and there are a lot of questions over Facebook’s algorithms and whether they surface legitimate information. That’s a huge hole that Google could exploit by offering a better news product and, once again, locking users into its ecosystem.

Google unveils ML Kit, an SDK that makes it easy to add AI smarts to iOS and Android apps

What Google announced: Google unveiled ML Kit, a new software development kit for app developers on iOS and Android that allows them to integrate pre-built, Google-provided machine learning models into apps. The models support text recognition, face detection, barcode scanning, image labeling and landmark recognition.

Why it’s important: Machine learning tools have enabled a new wave of use cases built on top of image recognition or speech detection. But even though frameworks like TensorFlow have made it easier to build applications that tap those tools, it can still take a high level of expertise to get them off the ground and running. Developers often figure out the best use cases for new tools and devices, and development kits like ML Kit help lower the barrier to entry and give developers without a ton of machine learning expertise a playground to start figuring out interesting use cases for those applications.
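ML Kit’s actual APIs are in Java, Kotlin and Swift; the Python sketch below just illustrates the general pattern such pre-built labelers expose to app developers: the model returns (label, confidence) pairs and the app filters them by a confidence threshold. The label data and threshold here are illustrative assumptions:

```python
# Illustration of the common pattern a pre-built image labeler exposes:
# the model returns (label, confidence) pairs and the app keeps only the
# ones above a threshold. Labels and threshold are assumed examples.
def filter_labels(raw_labels, min_confidence=0.7):
    """Keep only the labels the model is reasonably sure about."""
    return [(label, conf) for label, conf in raw_labels
            if conf >= min_confidence]

raw = [("Cat", 0.98), ("Pet", 0.91), ("Carnivore", 0.55)]
```

The point of the SDK is that the hard part, producing those confidences from pixels, is handled by Google’s models; the app developer only tunes what to do with them.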

So when will you be able to actually play with all these new features? The Android P beta is available today, and you can find the upgrade here.
