What are the activity_id formats for various platforms?

Admiral Patrick@lemmy.world · edit-2 9 hours ago

What are the activity_id formats for various platforms?

julian@activitypub.space · 4 hours ago

admiralpatrick@lemmy.world I think you would be better served by checking for the Link header. NodeBB and WordPress both send it, if you want to see how it is implemented.

julian@activitypub.space · edit-2 4 hours ago

It took me a minute to find, but it is detailed in evan@cosocial.ca’s write-up about HTTP Discovery of ActivityPub Objects.

This is probably exactly what you’re looking for.

https://swicg.github.io/activitypub-html-discovery/

I think your current approach has merit but is limited. If you know the instance software by URL and can resolve it using path matching without a pre-flight request, that’s absolutely a better way forward. The downside is that you have to know the URL patterns of every piece of software. You’ll never “catch 'em all”!

However, if that method fails, doing a pre-flight request to grab the Link header is a viable fallback.

You can test against NodeBB users or posts.
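
As a rough sketch of that fallback: the pre-flight response's Link header can be scanned for an entry that points at the ActivityPub representation. This assumes the server advertises it with rel="alternate" and type="application/activity+json"; check the linked report and the actual headers before relying on that. The helper names (`linkParam`, `findActivityPubLink`) are made up for illustration, and the parsing is deliberately simple rather than a full Link header parser.

```typescript
// Media type assumed to mark the ActivityPub representation of a page.
const ACTIVITY_JSON = "application/activity+json";

// Read one parameter (e.g. rel or type) from the parameter part of a link-value.
function linkParam(params: string, name: string): string | null {
  const re = new RegExp(`;\\s*${name}\\s*=\\s*(?:"([^"]*)"|([^;,\\s]*))`, "i");
  const m = params.match(re);
  if (!m) return null;
  return (m[1] ?? m[2] ?? "").trim();
}

// Hypothetical helper: return the ActivityPub URL advertised in a Link header, or null.
function findActivityPubLink(linkHeader: string | null): string | null {
  if (!linkHeader) return null;
  // Split only on commas that are followed by "<", the start of the next link-value.
  for (const part of linkHeader.split(/,\s*(?=<)/)) {
    const target = part.match(/^\s*<([^>]*)>/);
    if (!target) continue;
    const params = part.slice(target[0].length);
    const rels = (linkParam(params, "rel") ?? "").toLowerCase().split(/\s+/);
    const type = (linkParam(params, "type") ?? "").toLowerCase();
    if (rels.includes("alternate") && type === ACTIVITY_JSON) {
      return target[1] ?? null;
    }
  }
  return null;
}
```

The header value itself would come from the pre-flight response, for example `response.headers.get("link")` after a HEAD request. A null result means the URL goes to normal search.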

Jayjader@jlai.lu · edit-2 8 hours ago

From my own experience querying public Mastodon timelines via the API (edit: removed incorrect /api/v1s in the AP_IDs):

Mastodon user accounts have an ActivityPub URI of https://<instance.domain.tld>/users/<username>
Mastodon posts have an ActivityPub URI of https://<instance.domain.tld>/users/<post_author_username>/statuses/<post_id> (they also have a url property of https://<instance.domain.tld>/@<post_author_username>/<post_id>, but that tends to serve the HTML view of the post)
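
The two patterns above, plus the HTML variant, can be told apart with a few path checks. This is only a sketch: `classifyMastodonUrl` and its return labels are hypothetical names, and it assumes post IDs are purely numeric.

```typescript
type MastodonUrlKind = "actor" | "status" | "html-status" | "unknown";

// Hypothetical helper: classify a URL by the Mastodon path patterns listed above.
function classifyMastodonUrl(raw: string): MastodonUrlKind {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return "unknown"; // not a parseable absolute URL
  }
  // Ignore trailing slashes so /users/name/ is treated like /users/name.
  const path = url.pathname.replace(/\/+$/, "");
  // Assumption: post IDs consist of digits only.
  if (/^\/users\/[^/]+\/statuses\/\d+$/.test(path)) return "status";
  if (/^\/users\/[^/]+$/.test(path)) return "actor";
  // The url property form, which tends to serve HTML rather than the AP object.
  if (/^\/@[^/]+\/\d+$/.test(path)) return "html-status";
  return "unknown";
}
```

Only "actor" and "status" would count as AP_IDs here; "html-status" is the human-facing page.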

To see for yourself, pick an instance that allows viewing its public timeline without logging in (mastodon.social is perfect for this) and follow the “Playing with public data” section of the docs. That page elides most of the info you’re looking for in the example payloads it gives (as the JSON payloads themselves are quite large and nested), but I can assure you that AP_IDs for user accounts and posts can be found pretty quickly from a single timeline query.

I don’t think Mastodon has any notion of a community, nor does it distinguish between posts and comments: when following a Lemmy community, both posts and comments show up in my masto feed as “top-level” statuses (i.e. posts).

Admiral Patrick@lemmy.world · 7 hours ago

Cool, thanks. I was close with /user, guessing from memory.

I think the /users/.../post_id will be sufficient. It just needs to know that the given URL is an AP_ID before passing it off to the API call to resolveObject. Since it already knows instance.domain.tld is a federated instance, it just needs to see if the path is an AP_ID or the HTML (or something else). Thus, I don’t have to parse the whole thing, just check that enough of it matches.

Thanks!

rglullis@communick.news · 7 hours ago

So, I’ve rewritten the search / search boxes in Tesseract to skip the search and directly resolve activity pub URLs for users, posts, comments, and communities. I’m loving this as it makes things so much faster and easier.

Isn’t that the whole point of webfinger? Moreover, why would you paint yourself into a corner and hardcode the logic for all the different types of services, if ActivityPub uses JSON-LD and therefore provides a straightforward method for document dereferencing?

I’m not trying to be snarky. It’s just that I’m writing an ActivityPub server where the id of each object is just a ULID, because to the server there is zero difference between serving the information about an actor or an activity.
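
To make the dereferencing idea concrete: a client that can speak ActivityPub would request the URL with an ActivityPub Accept header and then look at the returned document's type, rather than at the URL's shape. The sketch below only covers the second half. `describeApDocument` and `ApDocument` are hypothetical names; the actor type list is the set of actor types from the ActivityStreams vocabulary.

```typescript
// Accept header value for requesting the ActivityPub representation of a URL.
const AP_ACCEPT =
  'application/ld+json; profile="https://www.w3.org/ns/activitystreams", application/activity+json';

// Actor types defined by the ActivityStreams vocabulary.
const ACTOR_TYPES = new Set(["Application", "Group", "Organization", "Person", "Service"]);

// Minimal shape of a dereferenced document; real documents carry many more fields.
interface ApDocument {
  id?: string;
  type?: string | string[];
}

// Hypothetical helper: decide what a dereferenced document is from its type alone.
function describeApDocument(doc: ApDocument): "actor" | "object" | "unknown" {
  const types = Array.isArray(doc.type) ? doc.type : doc.type ? [doc.type] : [];
  if (types.length === 0) return "unknown";
  return types.some((t) => ACTOR_TYPES.has(t)) ? "actor" : "object";
}
```

With this approach an opaque id such as a ULID works just as well as a descriptive path, at the cost of one request per lookup.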

Admiral Patrick@lemmy.world · 6 hours ago

We’ve had this discussion :)

This application is written against the Lemmy API. It only speaks API. Eventually it’ll speak Piefed API as well, but right now, only Lemmy API.

Lemmy and Piefed only do server-to-server ActivityPub, not client-to-server ActivityPub. Clients have to use the API to interact with them. This is a Lemmy (and eventually Piefed) client.

rglullis@communick.news · 5 hours ago

But then why do you worry about the ap_id patterns from other software?

Admiral Patrick@lemmy.world · 4 hours ago

I’m making an “omnisearch” box.

Paste an AP_ID into the search field, and it auto-resolves it and redirects you to your instance’s local copy (which is very fast) instead of going through the whole search process (which is slow). To prevent false positives, I’m matching the various ap_id formats and only doing the resolution on those; anything else gets passed to search.

Anything else that falls through the cracks just gets passed to search as usual (which also does a resolveObject lookup).

It’s to make life easier.
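
A minimal sketch of that routing decision, not Tesseract's actual code: `routeOmnisearch`, `OmnisearchRoute`, and the pattern list are illustrative names, and the list covers only the Lemmy and Mastodon path formats as examples. The set of federated hosts is assumed to be known already.

```typescript
type OmnisearchRoute =
  | { action: "resolve"; apId: string }
  | { action: "search"; query: string };

// Illustrative path patterns only; a real list needs one entry per supported platform.
const AP_ID_PATHS: RegExp[] = [
  /^\/(u|c)\/[^/]+$/, // Lemmy users and communities
  /^\/(post|comment)\/\d+$/, // Lemmy posts and comments
  /^\/users\/[^/]+(\/statuses\/\d+)?$/, // Mastodon actors and statuses
];

// Decide whether the input looks like an AP_ID on a known federated host.
function routeOmnisearch(input: string, federatedHosts: Set<string>): OmnisearchRoute {
  const query = input.trim();
  let url: URL;
  try {
    url = new URL(query);
  } catch {
    return { action: "search", query }; // plain text, not a URL
  }
  const isHttp = url.protocol === "https:" || url.protocol === "http:";
  const knownHost = federatedHosts.has(url.hostname.toLowerCase());
  const path = url.pathname.replace(/\/+$/, "");
  if (isHttp && knownHost && AP_ID_PATHS.some((re) => re.test(path))) {
    // Normalised form: query string and fragment are dropped.
    return { action: "resolve", apId: url.origin + path };
  }
  // Everything else falls through to the regular (slower) search.
  return { action: "search", query };
}
```

A "resolve" result would be handed to the resolveObject API call; a "search" result goes through the usual search path, which still gets a chance to resolve it.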