Want to create and maintain a popular proxy website? Or just a website in general, with near-perfect visibility across search engines and easy discovery for your Discord server?
Before beginning with the source, you need to understand where to look. First off, for those worried about SEO while hosting web proxies with keyword concerns (keywords can be used for source blocking): you only need to apply this to an official domain. Do you really think thousands of domains can retain equal SEO? No.
The idea is this:
First, create a way to serve a dedicated SEO source on your official domain, one that ends in .net/.dev/.com/.org/.co. (If your domain does not use one of these TLDs, forget about getting good SEO. .app is possible but harder unless you host a popular site.)
This serving can be done through server-side functions (check whether the request is for the official site and serve up the exclusive SEO source), through fancy str.replace methods (cough cough, source randomization), or by straight up hosting two separate sources.
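As a minimal sketch of the server-side approach (the host list, file names, and keyword set here are all placeholders, not the actual Holy Unblocker setup), the idea is: official domain gets the keyword-rich SEO source untouched; every mirror gets a randomized source with blockable keywords rewritten.

```python
import secrets

# Hypothetical host list -- your official, SEO-optimized domain(s) only.
OFFICIAL_HOSTS = {"holyunblocker.net", "www.holyunblocker.net"}

# Hypothetical keywords that filters commonly block on.
BLOCKABLE_KEYWORDS = ("proxy", "unblocker")


def randomize(html: str) -> str:
    """str.replace-style source randomization: swap blockable keywords
    for a harmless token with a random hex suffix."""
    token = secrets.token_hex(4)  # hex chars only, can never spell a keyword
    for keyword in BLOCKABLE_KEYWORDS:
        html = html.replace(keyword, f"app-{token}")
    return html


def pick_source(host: str, seo_html: str, stealth_html: str) -> str:
    """Serve the exclusive SEO source on the official domain,
    a randomized source everywhere else."""
    if host.lower() in OFFICIAL_HOSTS:
        return seo_html
    return randomize(stealth_html)
```

In practice you would call `pick_source()` from whatever request handler your server uses, passing the incoming Host header and the two pre-loaded source files.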
You might ask: is this worth it? Wouldn't it be harder to maintain or set up? Sadly, if you wish to utilize competitive Search Engine Optimization, you need to apply one of these serving methods in order to have a perfect source where you don't need to worry about keywords. Your official domain is going to get blocked almost instantly, BUT it still serves as a gateway to your backlinks: documentation, Discord, social media, and GitHub. It creates massive popularity. You might even ask yourself, who is going to search this up? PEOPLE will search, and SEO influences so much more than your classic distribution methods.
Starting Steps
Create a search console account for Google: https://search.google.com/search-console/about
Google should be your priority. You first need to actually get your site indexed and understand what Google is looking for. This tool will TELL you everything you need to know. Sure, you will need to nerd about a bit, but the steps below will make more sense once you see this console. On the actual console page, ENSURE you type out the absolute path to index; for example, “https://holyunblocker.net/” with the trailing /. Do this for every single page with the URL Inspection tool. Remember, your paths need to be served server-side, not as straight-up markup files. Once your site gains users, check the Performance tab to see how your keywords are doing, and below that, whether your descriptors are leaving good impressions on Google. With good SEO you should hold a spot somewhere between positions 1 and 6.
Create a webmasters account for Bing: https://www.bing.com/webmasters/about
Same concept as above, but for Bing. Bing is a bit more basic, so focus on Google; Bing tends to feed off it anyway. If you care more about Bing, ensure your setup (both NGINX and the actual source) specifically accounts for it. Remember, though: fewer users. Regardless, Bing tends to be a lot less picky than Google; everything from rich results onward is easier. And remember that Bing powers many other search engines, such as DuckDuckGo.
Read over creating Schema.org JSON-LD models:
Want to understand how to create “rich snippets” or rich results on Google? This is how. I am talking about those FAQ questions or the fancy structures you see for massive sites, linking to other pages or answering questions. YOU can control it all. There are core attributes and structure priorities that you must follow. I give examples in my source below; consider re-using them. Use Google's Rich Results Test tool to check how it looks.
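As a sketch of the idea, here is a minimal FAQPage JSON-LD block (the question and answer text are placeholders, not the actual Holy Unblocker source). It goes in your head tag inside a `<script type="application/ld+json">` element. Note that Google has restricted which sites get FAQ rich results displayed since 2023, but the markup still validates and other engines still read it.

```json
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [
    {
      "@type": "Question",
      "name": "What is a web proxy?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "A web proxy fetches pages on your behalf so you can browse through an intermediary server."
      }
    },
    {
      "@type": "Question",
      "name": "Is this service free?",
      "acceptedAnswer": {
        "@type": "Answer",
        "text": "Yes, the service is free and open source."
      }
    }
  ]
}
```

Paste the finished block into Google's Rich Results Test to confirm the structure parses before deploying.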
Create robots.txt and ensure it is routed correctly: https://yoast.com/ultimate-guide-robots-txt/
Once you create a Search Console account, you will notice this file is frequently mentioned. You need to allow specific paths and also block paths. DON'T JUST WHITELIST EVERYTHING. All search engines care about detailed robots.txt files. Will a sloppy one downrank you? No, but having a solid understanding of user agents gives you an edge over most basic sites.
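A minimal sketch of what that looks like (the disallowed paths here are hypothetical; use your own project's internal routes). Note the explicit Sitemap line, which both Google and Bing pick up:

```text
User-agent: *
Allow: /
Disallow: /static/
Disallow: /api/

User-agent: Bingbot
Crawl-delay: 1

Sitemap: https://holyunblocker.net/sitemap.xml
```

The per-agent section shows the idea of addressing specific crawlers; only add agent-specific rules when you actually need different behavior for that bot.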
Create a properly formatted sitemap, and prioritize content-RICH paths for your service
For this I could type out a guide, but that is rather unneeded; I will instead upload an example. Notice how I specify the priority of content-rich, important pages so that Google or Bing can understand them and build “Rich Results” better. Sometimes Google will ignore this file (it is a part of the old web), but it is still an essential SEO resource to have. Place it in the root directory of your served content, and submit it to the Search Console for both Google and Bing.
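A minimal sitemap sketch along those lines (the paths and priority values are placeholders; the home page and content-rich pages get the highest priority):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://holyunblocker.net/</loc>
    <changefreq>weekly</changefreq>
    <priority>1.0</priority>
  </url>
  <url>
    <loc>https://holyunblocker.net/proxies</loc>
    <changefreq>weekly</changefreq>
    <priority>0.8</priority>
  </url>
  <url>
    <loc>https://holyunblocker.net/faq</loc>
    <changefreq>monthly</changefreq>
    <priority>0.5</priority>
  </url>
</urlset>
```

Every `<loc>` must be an absolute URL on the property you verified in the consoles, or the submission will be rejected.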
Creating and maintaining backlinks: Below, I will provide source linking examples. Think about a spider web. You don’t want just your site’s top ranking but the entire spider web on that Google/Bing page
Before we even start touching the source, make sure you have some sort of social media presence and documentation. Having your project open source is one of the easiest ways to maintain backlinks. Ensure your README is in-depth and full of important descriptors. Provide screenshots or even videos of your service; all of this will be indexed. Google and Bing will naturally take any GitHub results and index them, building up your backlinks. Create social media accounts and post on them. Or don't, but at minimum make the descriptors unique enough and link them in your README. THIS IS ESSENTIAL. If you don't do this, you might as well be wasting your time with all these keyword changes. Without backlinks you cannot have a popular service.
Useful backlinks: GitHub, Docusaurus on a subdomain like “docs”, Patreon account, YouTube account, Twitter account, Open Collective account, XDA Forums post, Quora post, Discord server with vanity or non-vanity, credible sponsors such as Medium or organizations, other projects on GitHub or wikis with your unique project name or optimized official site (optimized as in source)
Understanding Cache fundamentals and timestamping assets:
Below, I will show an example of the importance of a basic web security setup. Cache TTL is a core metric search engines use both to crawl any updated content you have and to index it faster. Save yourself and your users pain by ensuring your assets update fast enough. Markup (HTML) will be nearly instant, but stylesheets and other scripts will take time. That is where timestamping comes into play.
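A minimal illustration of asset timestamping (the file names and version value are hypothetical): bumping the query string whenever a stylesheet or script changes forces caches and crawlers to fetch the new copy, even under a long TTL.

```html
<!-- Bump ?v= on each deploy; a build timestamp or content hash works well -->
<link rel="stylesheet" href="/assets/style.css?v=20240601">
<script src="/assets/main.js?v=20240601" defer></script>
```

Most build tools can inject the version automatically, so you never ship a stale reference by hand.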
For my example, I forced all my caching to Cloudflare, which acted as my CDN by specifying the proxy_cache_bypass directive with NGINX. You can use a different CDN or cache everything on your own server if you have a nice instance. Utilize the gzip directive fully.
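As a sketch of the NGINX side (paths and TTL values are hypothetical, not my exact Holy Unblocker configuration): long-lived caching for timestamped assets, a short TTL for HTML, and gzip enabled for compressible types.

```nginx
# Inside the http block: compress the text-based asset types
gzip on;
gzip_types text/css application/javascript application/json image/svg+xml;
gzip_min_length 256;

# Inside the server block:
location /assets/ {
    # Safe to cache hard because the ?v= timestamp busts stale copies
    expires 7d;
    add_header Cache-Control "public, max-age=604800, immutable";
}

location / {
    # Keep HTML near-instant so crawlers see updates quickly
    expires 5m;
    add_header Cache-Control "public, max-age=300";
}
```

If Cloudflare (or another CDN) sits in front, these headers also tell it how long to hold each object at the edge.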
Maintain a properly distributed NGINX reverse proxy setup and correct directives (will provide examples):
THIS IS SUPER IMPORTANT. I'm talking HTTP/2, TLS/SSL, security headers, and added HTTP headers.
First off, use NGINX, and if you wish, you can use Cloudflare for caching or whatever CDN you need. Keep NGINX as your initial reverse proxy. Many people don't understand the importance of modern web security and how Google or Bing actually account for it: they check for specific security headers, proper TLS/SSL settings, and fingerprinting. Above and below, I am providing various examples of what you should consider adding, as well as my own server example from when I was hosting Holy Unblocker.
My source. I realize this is a wall, but I have commented on every essential thing you could possibly need. This is also a good structure for anyone wanting a fast reverse proxy setup. I include all SEO-essential NGINX directives, plus error pages; note my formatting. Error pages are essential as well, although I could not tell you their exact SEO impact; they allow for a richer server setup.
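For orientation, a condensed sketch of a server block along these lines (certificate paths, upstream address, and error page locations are placeholders, not my actual source):

```nginx
server {
    listen 443 ssl http2;
    server_name holyunblocker.net;

    # Hypothetical certificate paths
    ssl_certificate     /etc/ssl/fullchain.pem;
    ssl_certificate_key /etc/ssl/privkey.pem;
    ssl_protocols       TLSv1.2 TLSv1.3;

    # Security headers that crawlers and scanners check for
    add_header Strict-Transport-Security "max-age=31536000; includeSubDomains" always;
    add_header X-Content-Type-Options "nosniff" always;
    add_header X-Frame-Options "SAMEORIGIN" always;
    add_header Referrer-Policy "strict-origin-when-cross-origin" always;

    # Branded error pages for a richer server setup
    error_page 404 /404.html;
    error_page 500 502 503 504 /50x.html;

    location / {
        proxy_pass http://127.0.0.1:8080;  # hypothetical upstream app
        proxy_set_header Host $host;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}
```

Run your finished config through an SSL/headers scanner to confirm the TLS settings and headers actually reach clients through any CDN in front.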
Source Structure
Overview
Rich and well-structured web content is obviously one of the most important factors for SEO, in addition to optimizing server-side changes and the steps mentioned above. This section will be split into both the core head tag structuring and the body tag.
Below, I list your primary focuses and the tricks I have learned along the way. Perhaps seeing is better than reading, so I also provide my entire optimized source with explanations:
Rich Keywords -
Not the keywords meta name attribute, but actually ensuring the majority of your site has relevant keywords tied to the description meta name attribute. Consider actually looking up some of these keywords and viewing what other sites use for this section, or note competitor keyword usage on their sites. Consider re-using the example above if your service shares a similar niche.
Notice the parallel keywords throughout my examples below. Currently, in 2024, most search engines will ignore the keywords meta attribute, but it still has an impact on some engines and on general structure. You want that universal ease for Google, Bing, and whatever else. For the description, I ensured it would actually be properly embedded on both Google and Bing; usually that means around 150 characters.
Readable Structure -
You are not using a framework that makes things unreadable, I hope. Regardless, these factors still apply. Keep your served content clean enough to read; that way, you can actually focus on the richness of the included keywords. You want your site to be as textbook as possible, source-wise.
Accessibility -
Google in particular (your primary focus) cares big time about accessibility. You could argue this is actually the second most important thing in this entire guide. Google will rank your site poorly if it has language or readability issues. This area includes font size, CLS (Cumulative Layout Shift), by-the-book responsiveness, etc. I will explain below how you can kinda “cheat” the regulations on mobile support. HOWEVER, having real mobile support is naturally great for a popular service.
A very good tool to use for this is Google's own PageSpeed Insights utility or Lighthouse. Utilize both to your liking.
Performance -
Notice the optimization throughout the source. Keep things clean and minimal, and keep animations restrained: if you do animate, stick to compositor-friendly CSS properties (transform, opacity), since animating layout properties like width or top forces reflows whether you use CSS keyframes or JavaScript. Performance is a strong metric for Google search ranking. If your site is laggy or has network issues, consider a proper hosting provider and/or a CDN. Lighthouse and the generic devtools are a good way to analyze paint times for your entire site and break down which assets are causing issues. Remember core practices such as serving fonts and similar assets from a fast CDN.
Overall Sitemap -
This guide is not just saying you need this file, or that you must organize your project folders a certain way. It doesn't matter whether paths are handled by NGINX or something else, although Google prefers real folder paths. (Still, be organized.)
By sitemap, this guide talks about how you link to other pages on your navbar. Your navbar is incredibly important for actually contributing to the other pillars above. You want each page to retain the same readable structure and accessibility but you need to ensure you switch up core keywords to give Google a reason to index this page. An example is having descriptions separating the web proxies page from the home page or games page blah blah. You get it.
This overall connects back to a readable structure through the h1, h2, and so-forth tags. Your core HTML tags are favored more by search engines than CSS structuring when it comes to markup. Ensure you utilize these tags, but still feel free to depend on CSS for every other design factor.
Head Tag (SEO)
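A condensed sketch of a head section along these lines (the title, description text, and URLs are placeholders, not the actual Holy Unblocker source):

```html
<head>
  <meta charset="utf-8">
  <meta name="viewport" content="width=device-width, initial-scale=1">

  <!-- Title and description carry your core keywords; keep the
       description around 150 characters so it embeds cleanly -->
  <title>Holy Unblocker | Web Proxies &amp; Games</title>
  <meta name="description" content="Browse freely with fast web proxies, games, and more from an open-source service.">

  <!-- Canonical URL points at the official domain only -->
  <link rel="canonical" href="https://holyunblocker.net/">

  <!-- Open Graph tags double as descriptors for social backlinks -->
  <meta property="og:title" content="Holy Unblocker">
  <meta property="og:description" content="Fast web proxies, games, and more.">
  <meta property="og:url" content="https://holyunblocker.net/">
</head>
```

The JSON-LD schema block from earlier would also live here, in its own script tag.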
Body Tag (SEO - NAVBAR)
It is essential that your sitemap.xml file and navbar match accordingly with page priority if you want to have correct Rich Results on Google or Bing. Ensuring you have clean and rich page names is essential. For example, I could have called the Web Proxies page just Proxies or something bland like Surf. However, focusing on core keywords, I decided to stick to Web Proxies. The same concept applies throughout the site for every page. Docs will be lengthened to Documentation, etc.
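As a sketch of the idea (the paths and page set are hypothetical), note how each link uses the full, keyword-rich page name rather than a bland short form:

```html
<nav>
  <ul>
    <li><a href="/" title="Holy Unblocker home page">Home</a></li>
    <!-- "Web Proxies", not "Proxies" or "Surf": keep the core keyword -->
    <li><a href="/proxies" title="Browse the available web proxies">Web Proxies</a></li>
    <li><a href="/games" title="Play games through the service">Games</a></li>
    <!-- "Documentation", not "Docs" -->
    <li><a href="/documentation" title="Read the full documentation">Documentation</a></li>
  </ul>
</nav>
```

These anchor names should line up with the page priorities declared in sitemap.xml so crawlers see one consistent structure.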
Body Tag (SEO - Content)
The main takeaways here are my use of the following (be sure not to abuse keywords):
title attribute for deeper descriptions (on hover, after a moment, this information appears over the element; remember what I said about accessibility)
Proper HTML markup structure for each respective element: headers are headers, text is text, and code tags hold code
alt attribute on images for accessibility
span elements for clarity when it comes to crawling
Descriptive class names for stylesheets (a ranking factor), or proper use of a framework (oddly, using a framework helps with SEO, but vanilla-site users can keep their own source; just keep the names descriptive and clean)
Rich keyword usage relevant to the site, which can help support backlinks
Body Content (SEO - FOOTER + SOCIALS)
This section might be the third most important factor. Properly setting up your backlinks is essential, as a combination of both the navbar and the overall footer can help. Modern web design has the stereotype of having socials in the footer. For an open-source project, this can include a lot more than socials to help build up that spider web.
The main takeaways here are:
Proper HTML structure, as stated before: header tags, list tags, and anchors used correctly
Rich keywords used again, related not just to the brand but also to the various socials or, in this case, mostly the open-source assets used. This method creates many backlinks, further boosting site and project visibility
Featuring linked socials in a readable way
Restating the obvious brand with a copyright symbol
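A minimal footer sketch tying those points together (all URLs here are hypothetical placeholders; substitute your real project links):

```html
<footer>
  <h2>Holy Unblocker</h2>
  <ul>
    <!-- Each anchor is a backlink node in the spider web -->
    <li><a href="https://github.com/example/holy-unblocker" rel="noopener">GitHub</a></li>
    <li><a href="https://docs.holyunblocker.net/" rel="noopener">Documentation</a></li>
    <li><a href="https://discord.gg/example" rel="noopener">Discord</a></li>
    <li><a href="https://twitter.com/example" rel="noopener">Twitter</a></li>
  </ul>
  <p>&copy; 2024 Holy Unblocker</p>
</footer>
```

Crawlers index these anchors on every page, so the footer quietly reinforces the same backlinks your README and socials point at.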