{"id":15807,"date":"2026-04-28T16:36:48","date_gmt":"2026-04-28T14:36:48","guid":{"rendered":"https:\/\/www.palentino.es\/blog\/?p=15807"},"modified":"2026-04-28T16:37:09","modified_gmt":"2026-04-28T14:37:09","slug":"%f0%9f%a7%a0-podra-tu-gpu-mover-ese-modelo-de-ia","status":"publish","type":"post","link":"https:\/\/www.palentino.es\/blog\/%f0%9f%a7%a0-podra-tu-gpu-mover-ese-modelo-de-ia\/","title":{"rendered":"&#x1f9e0; Can your GPU run that AI model?"},"content":{"rendered":"<p data-start=\"207\" data-end=\"275\">Every time you go to download a local AI model, the same thing comes up:<\/p>\n<p data-start=\"277\" data-end=\"297\"><strong data-start=\"277\" data-end=\"297\">BF16, FP8, INT4\u2026<\/strong><\/p>\n<p data-start=\"299\" data-end=\"378\">And the big question:<br data-start=\"314\" data-end=\"317\" \/>&#x1f449; <em data-start=\"320\" data-end=\"378\">Can my machine run this, or will it hang?<\/em><\/p>\n<p data-start=\"299\" data-end=\"378\"><!--more--><\/p>\n<hr data-start=\"380\" data-end=\"383\" \/>\n<h3 data-section-id=\"13ihyvg\" data-start=\"385\" data-end=\"437\">&#x1f50d; First: what those acronyms actually mean<\/h3>\n<p data-start=\"439\" data-end=\"517\">They are <strong data-start=\"443\" data-end=\"465\">numeric formats<\/strong> that define how the model's weights are stored.<\/p>\n<p data-start=\"519\" data-end=\"573\">&#x1f449; In simple terms:<br data-start=\"532\" data-end=\"535\" \/><strong data-start=\"535\" data-end=\"573\">how much information each number uses<\/strong><\/p>\n<ul data-start=\"575\" data-end=\"663\">\n<li data-section-id=\"17tny55\" data-start=\"575\" data-end=\"614\">More bits \u2192 more precision \u2192 more VRAM<\/li>\n<li data-section-id=\"1vizfde\" data-start=\"615\" data-end=\"663\">Fewer bits \u2192 less precision \u2192 lower memory use<\/li>\n<\/ul>\n<hr data-start=\"665\" data-end=\"668\" \/>\n<h3 data-section-id=\"1ths7js\" 
data-start=\"670\" data-end=\"728\">&#x2699;&#xfe0f; The most common formats (from highest to lowest memory use)<\/h3>\n<ul data-start=\"730\" data-end=\"921\">\n<li data-section-id=\"1e9l28s\" data-start=\"730\" data-end=\"796\"><strong data-start=\"732\" data-end=\"740\">FP32<\/strong> \u2192 maximum precision (training; unfeasible locally)<\/li>\n<li data-section-id=\"1ux8qnn\" data-start=\"797\" data-end=\"834\"><strong data-start=\"799\" data-end=\"814\">FP16 \/ BF16<\/strong> \u2192 the current standard<\/li>\n<li data-section-id=\"1xa8q97\" data-start=\"835\" data-end=\"866\"><strong data-start=\"837\" data-end=\"851\">FP8 \/ INT8<\/strong> \u2192 a balance<\/li>\n<li data-section-id=\"1t2lyu8\" data-start=\"867\" data-end=\"921\"><strong data-start=\"869\" data-end=\"883\">FP4 \/ INT4<\/strong> \u2192 ultra-compressed (ideal for local use)<\/li>\n<\/ul>\n<hr data-start=\"923\" data-end=\"926\" \/>\n<h3 data-section-id=\"n70nj3\" data-start=\"928\" data-end=\"960\">&#x1f4a1; What exactly is BF16?<\/h3>\n<p data-start=\"962\" data-end=\"1020\">BF16 (Brain Floating Point) uses 16 bits, but optimized:<\/p>\n<p data-start=\"1022\" data-end=\"1135\">&#x2714; It keeps a wide range of values<br data-start=\"1059\" data-end=\"1062\" \/>&#x2714; It cuts cost compared to FP32<br data-start=\"1092\" data-end=\"1095\" \/>&#x2714; It is the base format for many models<\/p>\n<p data-start=\"1137\" data-end=\"1197\">&#x1f449; That is why many models ship in <strong data-start=\"1173\" data-end=\"1197\">\u201cuncompressed\u201d BF16<\/strong><\/p>\n<hr data-start=\"1199\" data-end=\"1202\" \/>\n<h3 data-section-id=\"1ymxsfh\" data-start=\"1204\" data-end=\"1246\">&#x26a1; The quick trick that saves you time<\/h3>\n<p data-start=\"1248\" data-end=\"1302\">&#x1f449; <strong data-start=\"1251\" data-end=\"1302\">Model parameters \u00d7 factor \u2248 VRAM needed<\/strong><\/p>\n<p data-start=\"1304\" data-end=\"1325\">Approximate factors:<\/p>\n<ul data-start=\"1327\" data-end=\"1382\">\n<li data-section-id=\"338jqr\" data-start=\"1327\" data-end=\"1340\">BF16 \u2192 x2<\/li>\n<li data-section-id=\"160ihsp\" data-start=\"1341\" data-end=\"1360\">FP8 \/ INT8 \u2192 x1<\/li>\n<li data-section-id=\"17sxldt\" data-start=\"1361\" data-end=\"1382\">FP4 \/ INT4 \u2192 x0.5<\/li>\n<\/ul>\n<hr data-start=\"1384\" data-end=\"1387\" \/>\n<h3 data-section-id=\"b1my7z\" data-start=\"1389\" data-end=\"1408\">&#x1f4ca; A real-world example<\/h3>\n<p data-start=\"1410\" data-end=\"1439\">A model with <strong data-start=\"1420\" data-end=\"1438\">35B parameters<\/strong>:<\/p>\n<ul data-start=\"1441\" data-end=\"1494\">\n<li data-section-id=\"5xwjnp\" data-start=\"1441\" data-end=\"1458\">BF16 \u2192 ~70 GB<\/li>\n<li data-section-id=\"1slq64p\" data-start=\"1459\" data-end=\"1475\">FP8 \u2192 ~35 GB<\/li>\n<li data-section-id=\"13ab7lo\" data-start=\"1476\" data-end=\"1494\">FP4 \u2192 ~17.5 GB<\/li>\n<\/ul>\n<p data-start=\"1496\" data-end=\"1544\">&#x1f449; Here you can already see why quantization is key.<\/p>\n<hr data-start=\"1546\" data-end=\"1549\" \/>\n<h3 data-section-id=\"19f9dqb\" data-start=\"1551\" data-end=\"1593\">&#x26a0;&#xfe0f; Careful (this is what many people forget)<\/h3>\n<p data-start=\"1595\" data-end=\"1645\">That calculation <strong data-start=\"1607\" data-end=\"1644\">only includes the model weights<\/strong>.<\/p>\n<p data-start=\"1647\" data-end=\"1686\">In real use you need more memory for:<\/p>\n<ul data-start=\"1688\" data-end=\"1744\">\n<li data-section-id=\"1ab8rth\" data-start=\"1688\" data-end=\"1723\">the KV cache (context memory)<\/li>\n<li data-section-id=\"uxiyr1\" data-start=\"1724\" data-end=\"1744\">internal buffers<\/li>\n<\/ul>\n<p data-start=\"1746\" data-end=\"1796\">&#x1f449; Tip: <strong data-start=\"1758\" data-end=\"1796\">add headroom or it will fail \/ run slowly<\/strong><\/p>\n<hr data-start=\"1798\" data-end=\"1801\" 
\/>\n<h3 data-section-id=\"1uovcwn\" data-start=\"1803\" data-end=\"1826\">&#x1f3af; Clear takeaway<\/h3>\n<ul data-start=\"1828\" data-end=\"1909\">\n<li data-section-id=\"1yelnq2\" data-start=\"1828\" data-end=\"1863\">BF16 = more quality, more memory use<\/li>\n<li data-section-id=\"d2bk0w\" data-start=\"1864\" data-end=\"1884\">FP8 = a balance<\/li>\n<li data-section-id=\"1429d6s\" data-start=\"1885\" data-end=\"1909\">INT4 = maximum savings<\/li>\n<\/ul>\n<p data-start=\"1911\" data-end=\"1995\">&#x1f449; Understanding this lets you decide in seconds whether a model is viable on your machine.<\/p>\n<hr data-start=\"1997\" data-end=\"2000\" \/>\n<p data-start=\"2002\" data-end=\"2108\">&#x1f4ac; If you work with local AI, this calculation can save you hours of testing\u2026 and quite a few frustrations.<\/p>\n<p data-start=\"2002\" data-end=\"2108\"><a href=\"https:\/\/www.palentino.es\/blog\/wp-content\/uploads\/2026\/04\/Calculo-Memoria-ModeloIA-palentino.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-15809\" src=\"https:\/\/www.palentino.es\/blog\/wp-content\/uploads\/2026\/04\/Calculo-Memoria-ModeloIA-palentino.png\" alt=\"\" width=\"1024\" height=\"1536\" srcset=\"https:\/\/www.palentino.es\/blog\/wp-content\/uploads\/2026\/04\/Calculo-Memoria-ModeloIA-palentino.png 1024w, https:\/\/www.palentino.es\/blog\/wp-content\/uploads\/2026\/04\/Calculo-Memoria-ModeloIA-palentino-200x300.png 200w, https:\/\/www.palentino.es\/blog\/wp-content\/uploads\/2026\/04\/Calculo-Memoria-ModeloIA-palentino-683x1024.png 683w\" sizes=\"auto, (max-width: 1024px) 100vw, 1024px\" \/><\/a><\/p>\n","protected":false},"excerpt":{"rendered":"<p>Every time you go to download a local AI model, the same thing comes up: BF16, FP8, INT4\u2026 And the big question:&#x1f449; Can my machine run this, or will it hang?<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[1415],"tags":[],"class_list":["post-15807","post","type-post","status-publish","format-standard","hentry","category-sin-categoria-es"],"_links":{"self":[{"href":"https:\/\/www.palentino.es\/blog\/wp-json\/wp\/v2\/posts\/15807","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.palentino.es\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.palentino.es\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.palentino.es\/blog\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/www.palentino.es\/blog\/wp-json\/wp\/v2\/comments?post=15807"}],"version-history":[{"count":2,"href":"https:\/\/www.palentino.es\/blog\/wp-json\/wp\/v2\/posts\/15807\/revisions"}],"predecessor-version":[{"id":15811,"href":"https:\/\/www.palentino.es\/blog\/wp-json\/wp\/v2\/posts\/15807\/revisions\/15811"}],"wp:attachment":[{"href":"https:\/\/www.palentino.es\/blog\/wp-json\/wp\/v2\/media?parent=15807"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.palentino.es\/blog\/wp-json\/wp\/v2\/categories?post=15807"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.palentino.es\/blog\/wp-json\/wp\/v2\/tags?post=15807"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}
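
The post's rule of thumb (parameters × factor ≈ VRAM, with extra headroom for the KV cache and buffers) can be sketched in a few lines of Python. The factor table mirrors the post's values; the function name and the 20% default headroom are this sketch's own assumptions, not figures from the post:

```python
# Rough VRAM estimate following the "parameters x factor" rule of thumb.
# Factors are GB of VRAM per billion parameters (bits per weight / 8).
FACTORS_GB_PER_B_PARAMS = {
    "BF16": 2.0,  # 16 bits -> 2 bytes per weight
    "FP16": 2.0,
    "FP8": 1.0,   # 8 bits -> 1 byte per weight
    "INT8": 1.0,
    "FP4": 0.5,   # 4 bits -> 0.5 bytes per weight
    "INT4": 0.5,
}

def estimate_vram_gb(params_billions: float, fmt: str, headroom: float = 0.2) -> float:
    """Weights-only estimate plus a headroom margin for KV cache and buffers.

    The 20% default headroom is an illustrative guess; real needs depend on
    context length, batch size, and the inference runtime.
    """
    weights_gb = params_billions * FACTORS_GB_PER_B_PARAMS[fmt]
    return weights_gb * (1 + headroom)

if __name__ == "__main__":
    # Reproduces the post's 35B example (weights only, no headroom):
    for fmt in ("BF16", "FP8", "FP4"):
        print(f"35B in {fmt}: ~{estimate_vram_gb(35, fmt, headroom=0):.1f} GB weights")
```

With no headroom this reproduces the 35B example above (70, 35, and 17.5 GB); with the default margin it hints at what a card actually needs to hold before the model will load and run.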