I'm using `node-http-proxy` to run a proxy website. I would like to proxy any target website that the user chooses, similarly to what's done by https://www.proxysite.com/, https://www.croxyproxy.com/ or https://hide.me/en/proxy.
How would one achieve this with `node-http-proxy`?
What I have tried:
## Idea #1: use a `?target=` query param.
My first naive idea was to add a query param to the proxy, so that the proxy can read it and redirect.
Code-wise, it would look more or less like this (assuming we deploy it to https://myproxy.com):
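```typescript
import type { NextApiRequest, NextApiResponse } from 'next';
// Assumption: httpProxyMiddleware is the default export of `next-http-proxy-middleware`,
// a thin wrapper around node-http-proxy.
import httpProxyMiddleware from 'next-http-proxy-middleware';

const BASE_URL = 'https://myproxy.com';

async function handler(
  req: NextApiRequest,
  res: NextApiResponse
): Promise<void> {
  try {
    // req.url is relative (e.g. "/?target=..."), so resolve it against BASE_URL.
    const url = new URL(req.url ?? '/', BASE_URL);
    const targetURLStr = url.searchParams.get('target');
    if (!targetURLStr) {
      res.status(400).json({ error: 'Missing ?target= query param' });
      return;
    }
    // Await so that a rejected proxy promise is handled by the catch below.
    await httpProxyMiddleware(req, res, {
      changeOrigin: true,
      target: targetURLStr,
    });
  } catch (err) {
    res.status(500).json({ error: (err as Error).message });
  }
}
```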
**Problem**: If I deploy this code to myproxy.com, and load `https://myproxy.com?target=https://google.com`, then google.com is loaded, but:
- if I click a link to Google Images, the browser loads `https://myproxy.com/images` instead of `https://myproxy.com?target=https://google.com/images`. See also https://stackoverflow.com/questions/70383955/url-as-query-param-in-proxy-how-to-navigate
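As far as I understand, this is plain relative-URL resolution: the browser resolves each link against the current page URL and drops that page's query string. A minimal sketch with the standard `URL` class (the `resolveLink` helper and the `/images` href are only illustrative assumptions, not taken from Google's actual markup):

```typescript
// Hypothetical helper: mimics how a browser resolves a link found on a page.
function resolveLink(href: string, pageURL: string): string {
  // The base URL's query string (?target=...) is not carried over.
  return new URL(href, pageURL).href;
}

// The browser believes the page lives on the proxy, not on google.com.
const proxiedPage = 'https://myproxy.com/?target=https://google.com';

console.log(resolveLink('/images', proxiedPage)); // https://myproxy.com/images
```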
## Idea #2: use cookies
The second idea is to read the `?target=` query param like above, store its origin in a cookie, and proxy all subsequent requests to the origin stored in the cookie.
For example, suppose the user wants to access https://google.com/a/b?c=d via the proxy. The flow is:
- go to `https://myproxy.com?target=${encodeURIComponent('https://google.com/a/b?c=d')}`
- the proxy reads the `?target=` query param and stores the origin (`https://google.com`) in a cookie
- the proxy redirects to `https://myproxy.com/a/b?c=d` (307 redirect)
- the proxy sees a new request, and since the cookie is set, it passes this request to `node-http-proxy` using the cookie's value as the target.
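The decision logic behind this flow can be sketched roughly as follows. This is a simplified stand-in for the gist, not the gist itself: `decide`, `Decision` and `PROXY_ORIGIN` are hypothetical names, and the actual cookie handling and the call into `node-http-proxy` are left out.

```typescript
const PROXY_ORIGIN = 'https://myproxy.com';

type Decision =
  | { kind: 'redirect'; location: string; cookieValue: string }
  | { kind: 'proxy'; target: string }
  | { kind: 'error'; message: string };

// reqUrl is the relative request URL; cookieTarget is the value previously stored in the cookie.
function decide(reqUrl: string, cookieTarget: string | undefined): Decision {
  const url = new URL(reqUrl, PROXY_ORIGIN);
  const targetStr = url.searchParams.get('target');

  if (targetStr !== null) {
    try {
      const target = new URL(targetStr);
      // Remember the origin in a cookie and redirect to the same path on the proxy.
      return {
        kind: 'redirect',
        cookieValue: target.origin,
        location: target.pathname + target.search,
      };
    } catch {
      return { kind: 'error', message: 'Invalid target URL' };
    }
  }

  if (cookieTarget) {
    // No ?target= but a cookie is present: forward the request to the stored origin.
    return { kind: 'proxy', target: cookieTarget };
  }

  return { kind: 'error', message: 'No target set' };
}
```

With the example above, `decide('/?target=' + encodeURIComponent('https://google.com/a/b?c=d'), undefined)` yields a redirect to `/a/b?c=d` with the cookie value `https://google.com`.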
Code-wise, it would look like: https://gist.github.com/throwaway34241/de8a623c1925ce0acd9d75ff10746275
**Problem:** This works very well, but only for one target at a time. If I open one browser tab with `https://myproxy.com?target=https://google.com`, and another tab with `https://myproxy.com?target=https://facebook.com`, then:
- first it sets the cookie to https://google.com, and I can navigate in the 1st tab correctly
- then I go to the 2nd tab (without closing the 1st one), it sets the cookie to https://facebook.com, and I can navigate Facebook in the 2nd tab correctly
- but then if I go back to the 1st tab, it proxies Google's resources through Facebook, because the cookie has been overwritten.
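My understanding is that this happens because cookies are scoped to the site, not to the tab. A deliberately simplified model of that behaviour (the `cookieJar` map and both helper functions are hypothetical, purely for illustration):

```typescript
// Simplified model: one cookie jar per browser, keyed by site, shared by every tab.
const cookieJar = new Map<string, string>();

// Opening ?target=... makes the proxy store the target origin in a cookie for myproxy.com.
function openViaProxy(target: string): void {
  cookieJar.set('myproxy.com', new URL(target).origin);
}

// Any tab on myproxy.com sends the same cookie, whichever tab caused it to be set.
function targetForNextRequest(): string | undefined {
  return cookieJar.get('myproxy.com');
}

openViaProxy('https://google.com');   // tab 1
openViaProxy('https://facebook.com'); // tab 2 overwrites the cookie
console.log(targetForNextRequest());  // back in tab 1: https://facebook.com
```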
I'm a bit out of ideas, and am wondering how those generic proxy websites do it. Ideally, I would prefer not to parse the HTML of the target website.